SUMMARY



The Air Force's Learning Abilities Measurement Program (LAMP) conducts basic research on the
nature of human learning abilities, with the ultimate goal of contributing to an improved personnel
selection and classification system. To date, studies in the program have investigated the relationship
between aptitude measures and performance on simple learning tasks. One limitation of these studies
is that it may be inappropriate to generalize their results to an operational setting. Thus, future
efforts will validate the aptitude tests against more complex learning such as computer programming,
electronic troubleshooting, flight engineering, and air traffic control.

Before the newer effort is underway, it is critical to give serious attention to the question of how
learning might be measured in more complex environments. In this paper, we demonstrate how
learning indicators may be derived from a taxonomy of learning to ensure that a wide range of learning
outcomes will be assessed during instruction. The paper first reviews existing taxonomies and points
out their limitations. A taxonomy is then proposed based on a synthesis of current thought regarding
the forms of knowledge, the types of learning activities, the importance of the domain, and the effects
of the learner's style. The taxonomy is applied to analyze some computerized instructional programs
that attempt to measure student learning, and to show how the programs might be improved by
measuring a broader variety of learning outcomes. The paper concludes by speculating about how the
taxonomy aids consideration of a broad variety of questions concerning the relationships between basic
cognitive skills and learning outcomes, and the relationships among different kinds of learning
experiences.
I. INTRODUCTION



What is the relationship between intelligence and learning ability? This question engaged
contributors to the original Learning and Individual Differences, and we believe (and hope to show
how) the sophistication of the answer to this question highlights, perhaps as clearly as to any other
question, exactly how far our theories have come over the last 20 years.

Until recently, and certainly in evidence throughout that previous volume, the typical response to
such a question might very well have been "there is no relation between intelligence and the ability
to learn" or "the relationship is weak at best." This position reflects conclusions drawn from the widely
cited series of studies by Woodrow (1946), who found that with extended practice on a variety of
learning tests (e.g., canceling tasks, analogies, addition), the performance of brighter students did not
improve at a rate substantially greater than that shown by poorer students. Woodrow's studies are no
longer viewed as incontrovertible in addressing the intelligence-learning issue, primarily because of
problems with the measures of learning ability he employed: His learning tasks may have been too
simple (Campione, Brown, & Bryant, 1985; Humphreys, 1979) and his conception of learning as
improvement due to practice was too simplistic. Had he selected other kinds of learning tasks, and
measured learning with other performance indices, his results might have been quite different, as
subsequent investigation has shown (e.g., Snow, Kyllonen, & Marshalek, 1984).

A general conclusion may be drawn here: To address questions regarding learning ability, such as
the question of its correlates and its dimensionality, it is important to have a clear idea of exactly what
is meant by learning ability, to the point of being able to specify learning indicators. Problems and
confusions such as those introduced by Woodrow could have been resolved by selecting learning
indicators from an agreed-upon taxonomy of learning skills.^1

^1 For the purposes of this paper we distinguish learning abilities from learning skills. We define
abilities as individual-difference dimensions in a factor analysis of learning tasks. We define skills as
candidate individual-difference dimensions which are presently only conceptually distinct. In this way,
we believe that proposing learning skills is logically prior to establishing the individual differences
dimensions underlying learning. Proposing a learning skills taxonomy should assist in determining the
dimensions of learning ability. We realize that our use of the terms abilities and skills may be
somewhat idiosyncratic.
Indeed, there are many potential benefits to having a widely accepted taxonomy of learning skills.
Consider Bloom's (1956) Taxonomy of Educational Objectives. Its primary purpose was to serve as an
aid, especially to teachers, for considering a wider range of potential instructional goals and for
considering means for evaluating student achievement consistent with those goals. Although the
taxonomy has been criticized for vagueness (Ennis, 1986), it has served teachers well over the last 30
years, at least as demonstrated by its continued inclusion in teacher training curricula. Its main effect
has probably been to encourage instructing and testing of higher-order thinking skills (analysis,
synthesis, evaluation). A taxonomy of learning skills could have a parallel effect in encouraging the
development of instructional objectives concerned with teaching higher-order learning skills.

Fleishman and Quaintance (1984) have outlined a number of ways, both scientific and practical, in
which a performance taxonomy in psychology would be beneficial. The main scientific benefit would
be that results from different studies using differing methods could more easily be compared and
synthesized. Study A finds that some manipulation drastically affects performance on task X, whereas
study B finds that the same manipulation has no effect on performance of task Y. Are the studies
contradictory or compatible? A taxonomy could help one decide.

The main practical benefit of having a taxonomy of learning skills is that consumers of research
findings could more easily determine the limits of generalizability from current research findings to an
immediate practical problem. For example, it would be convenient to be able to produce learnability
metrics for any kind of learning task, either in the classroom (e.g., a particular algebra curriculum) or
outside the classroom (e.g., a new word processing system). A taxonomy of learning skills would be an
important first step toward achieving a generally useful learnability metric system.

There are also more specific motivations for the immediate development of a taxonomy of learning
skills. The National Assessment of Educational Progress is a biennial survey of student achievement in
areas such as mathematics, science, and computer science, designed to provide information to
Congress, school officials, and other policy makers regarding the state of American education. In
recent years there has been increasing attention given to the assessment of higher-order skills in these
subject areas (e.g., Fredericksen & Pine, in press). It is likely that, due to political pressures, this effort
will continue with or without a taxonomy, but a taxonomy of learning skills could assist in the
development of new, more refined test items to measure learning skills relevant to math and science.

Perhaps the most conspicuous benefits of having a viable taxonomy of learning skills would be
realized in the burgeoning domain of intelligent computerized tutoring systems (ITSs). A number of
such systems have been developed (Yazdani, 1986), and the potential for generalizing and synthesizing
results across the different systems is seen as increasingly critical (Soloway & Littman, 1986). Too
often, researchers caught up in the excitement of developing powerful, innovative instructional systems
have neither the interest nor the expertise for systematically evaluating those systems. There have been
a few small-scale evaluation studies of global outcomes (e.g., Anderson, Boyle, & Reiser, 1985), but the
field could obviously benefit from an accepted taxonomy. System developers could state what kinds of
learning skills were being developed, and evaluators could determine the degree of success achieved.
In this way, a taxonomy could provide a useful metric by which to compare and evaluate tutors as to
their relative effectiveness, not only in teaching the stipulated subject matter but also in promoting
more general learning skills.

The intelligent tutoring system context is a natural beneficiary of a learning taxonomy in a second
way. Because of the precision with which instructional objectives may be stated, the degree of tutorial
control over how these objectives guide instructional decisions, and the precision with which student
learning may be assessed, the ITS environment enables the examination of issues on the nature of
learning that were simply not addressable in the past. Educational research has been plagued with
noisy data, due to the very nature of field research and the inherent lack of control over the way
instructional treatments are administered and learning outcomes measured. The controlled ITS
environment thus offers new promise as the ideal testbed for evaluating fundamental issues in learning.
With ITSs, we now have the capability of generating rich descriptions of an individual learner's progress
during instruction. A taxonomy should help in determining exactly what indicators of learning progress
and learner status we ought to be producing and examining. So, a test of the utility of any learning
taxonomy is whether it could be used to actually assist in such an endeavor. Our goal for this chapter is
to propose such a taxonomy. We begin by looking at what has been done thus far.

II. A TAXONOMY OF LEARNING TAXONOMIES

Various approaches to the development of learning taxonomies have been employed. One way of
organizing these approaches, which we apply here, is by the categories of (a) designated/rational,
based on a conditions-of-learning analysis; (b) empirical-correlational, based on an individual
differences analysis; and (c) model-based, from formal computer simulations of learning processes.

Designated/Rational Taxonomies 

Designated/rational taxonomies are by far the most common. Examples of this type are
taxonomies proposed by Bloom (1956), Gagne (1965; 1985), Jensen (1967), and Melton (1964).
Proposed taxonomies are based on a speculative, rational analysis of the domain, and frequently, the
analysis applied is of a conditions-of-learning nature. That is, the proposer defines task categories in
terms of characteristics that will foster or inhibit learning or performance.

One of the first attempts to organize the varieties of learning was Melton's (1964) proposal of a
simple taxonomy based primarily on clusters of tasks investigated by groups of researchers. The
categories, roughly ordered by the complexity of the learning act, were conditioning, rote learning,
probability learning, skill learning, concept learning, and problem solving. This general scheme was
updated by Estes (1982), who examined conditions that facilitated and inhibited these and related
classes of learning, and looked for evidence of individual differences in each class.

A task-based scheme was also the basis for learning taxonomies proposed by Jensen (1967) and
Gagne (1965; 1985). Jensen proposed a three-faceted taxonomy: a Learning-type facet incorporated
Melton's seven categories; a Procedures facet indicated variables such as the pacing of the task, stage of
learning, whether the task consisted of spaced or massed practice, and the like; and a Content/Modality
facet indicated whether the task consisted of verbal, numerical, or spatial stimuli. Jensen proposed that
his taxonomy could be used as an aid in interpreting some research findings, such as why arbitrarily
selected learning tasks do not intercorrelate very highly (answer: because they do not share any facet
values). He hoped that his taxonomy would suggest a more systematic approach to selecting learning
tasks for future studies, but there is not much evidence that researchers have subsequently followed his
suggestions.
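
Jensen's facet idea can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class name `LearningTask`, the function `shared_facets`, and the facet values assigned to each task are hypothetical and are not taken from Jensen (1967). It shows how one might check whether two tasks share any facet values, which is the condition Jensen associated with tasks intercorrelating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningTask:
    """A task described by three facets (values here are illustrative only)."""
    name: str
    learning_type: str   # e.g., one of Melton's categories
    procedure: str       # e.g., pacing, spaced vs. massed practice
    content: str         # verbal, numerical, or spatial stimuli


def shared_facets(a, b):
    """Return the names of the facets on which two tasks take the same value."""
    facets = ("learning_type", "procedure", "content")
    return [f for f in facets if getattr(a, f) == getattr(b, f)]


# Hypothetical facet assignments for three tasks.
paired = LearningTask("verbal paired-associates", "rote learning", "massed practice", "verbal")
maze = LearningTask("maze learning", "skill learning", "spaced practice", "spatial")
recall = LearningTask("free recall of words", "rote learning", "spaced practice", "verbal")

print(shared_facets(paired, maze))    # no facet values in common
print(shared_facets(paired, recall))  # shares learning type and content
```

On this reading, the first pair would be expected to correlate weakly and the second pair more strongly, although the taxonomy itself makes no quantitative prediction.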

Gagne's taxonomy (1965; 1985), on the other hand, has been widely taught and put to use in the
area of instructional design (Gagne & Briggs, 1979). Gagne proposes five major categories of learned
capabilities based on a rational analysis of common performance characteristics. Intellectual skills
(procedural knowledge) reflect the ability to use rules; this capability in turn depends on the ability to
make discriminations and to use concepts, and the rules themselves combine to form higher-order
rules and procedures. Cognitive strategies (executive control processes) reflect the ability to govern
one's own learning and performance processes. Verbal information reflects the ability to recall and use
labels, facts, and whole bodies of knowledge. Motor skills and Attitudes are two additional learned
capabilities Gagne included to round out the list.

These categories serve various purposes. They assist the investigator in defining and analyzing
instructional objectives during task analysis, and later, in evaluating an instructional system to
determine whether its objectives have been met. For example, if the goal is to have the student acquire
a conceptual skill, then the objective that the student be able to "discriminate" one thing from another
may be indicated. In the design phase, the categories suggest different approaches for delivering
instruction, since, according to Gagne, the five capabilities differ as to the conditions most favorable for
their learning. For example, with verbal information, order is not important but providing a
meaningful context is; for motor skills, providing intensive practice on part skills is critical.

All of these taxonomic systems, Gagne's in particular, are beneficial, but it is important to
acknowledge their limitations. One problem inherent in the rational approach is the degree to which it
is subject to imprecision, which makes for communication difficulties and violates one of the main
motivations for developing the taxonomy in the first place. Without a strong model of learning
requirements in a task, and without a foundation of empirical relationships, task analysis is still
primarily an art rather than a technology.

A second major problem with the rational approach was apparent to Melton (1964, 1967), who, in
fact, argued that it should be abandoned. The problem is that a taxonomic scheme based primarily on
a rational analysis of task characteristics will only incidentally include actual psychological process
dimensions. And presumably the process dimensions are what govern the most important aspect of the
taxonomy: information regarding predicted task-to-task generality. Melton suggested that while the
task-based approach might be initially useful, it was preferable ultimately to base the taxonomy on
process characteristics rather than "a mish-mash of procedural and topographic (i.e., perceptual, motor,
verbal, 'central') criteria" (p. 331).^2 Although it was premature at that time to have actually suggested
replacements for the task-based categories, we will show later how cognitive science now provides
suggestions for what they might be.

Empirical-Correlational Taxonomies

A second approach, less commonly used in the domain of learning skills, has been primarily
empirical. The history of individual differences research can be seen largely as an attempt to develop
taxonomies of intelligence tests based on performance correlations (e.g., Thurstone, 1938), and there
have been some attempts to develop similar taxonomies of learning tasks (e.g., Allison, 1960; Malmi,
Underwood, & Carroll, 1979; Stake, 1961; Underwood, Boruch, & Malmi, 1978).

The empirical-correlational approach has one critical advantage over the rational approach as a
means for taxonomy development: It directly addresses the issue of the transferability of skills among
tasks. That is, if we know that performance on learning task X is highly correlated with performance
on task Y, then a natural proposal is that a high proportion of the skills task X requires are also
required by task Y. Further, training on task X should transfer at least somewhat to task Y. Thus,
patterns of correlations among performances on learning tasks could, in principle, be the basis for the
construction of a taxonomy of learning skills.

^2 It is historically interesting that it was at Melton's (1964) conference that Fitts (1964) proposed a
highly process-oriented taxonomy of psychomotor skills which was only much later adapted by
Anderson (1983) as the basis for a cognitive learning theory.

A very closely related idea, that individual differences investigations could serve as testbeds in
constructing general theories of learning, was developed by Underwood (1975). His proposal was that
if a theory assumed some mechanism, and the mechanism could be measured in a context outside that
in which it was initially developed, then the viability of the mechanism could be tested by correlational
analysis.

These ideas were applied in an ambitious investigation that examined the intercorrelations among a
wide variety of verbal memory tests (Underwood et al., 1978). The purpose was to determine whether
theoretical notions developed in the general (nomothetic) learning literature, such as the idea that
memories have imaginal and acoustic attributes, or that recognition processes are distinct from recall
processes, could be verified with an individual differences analysis.

The memory task stimuli were primarily words. In some tasks, words were randomly selected, but
in others, words were chosen to elicit particular psychological processes. For example, concrete and
abstract words were mixed, under the assumption that recall differences would reflect the degree of
imagery involvement. Words were embedded in various kinds of memory tasks (paired-associates, free
recall, serial learning, memory span, frequency judgment). It was expected that clear word-attribute
factors would emerge, thus supporting certain theoretical notions regarding properties of memory, but
Underwood and colleagues discovered two somewhat unanticipated results. First, most of the variance
was due to general individual differences in associative learning; only a small percentage was due to any
subject-by-task interaction. Second, the two factors that did emerge were not associated with word
attributes, as might have been expected, but with type of task (free recall vs. paired-associates and
serial learning); but even this apparently was not a robust task division. A followup study (Malmi et al.,
1979) found the same evidence for a general associative-learning factor, but the two extracted factors
split tasks in a slightly different way (free recall and serial learning vs. paired-associates).

What is the implication for a taxonomy of learning skills? Association formation rate apparently is
a general, and perhaps fundamental, learning parameter. It may be that further subtle distinctions
could be made among types of association formation, but the evidence in both these studies suggests
little practical payoff in searching for such distinctions.

Underwood and colleagues were primarily interested in memory per se; thus, their tasks
represented a fairly narrow range of learning. A useful complement to their analysis would be a study
that more systematically sampled learning tasks from something like Melton's or Gagne's taxonomy. In
this regard, we consider a pair of studies by Allison (1960) and Stake (1961), who administered a
diverse variety of learning tasks to large samples of Navy recruits and seventh-graders, respectively.
Allison's learning tasks were four paired-associates tasks (verbal, spatial, auditory, and haptic stimuli),
four concept formation tasks (spatial and verbal stimuli), two mechanical assembly tasks consisting of a
short study film followed by an assembly test, a maze tracing task, a standard rotary pursuit task, and a
task that involved learning how to plot quickly on a polar coordinates grid. Stake's learning tasks were
listening comprehension (repeated study-test trials of the same story), free recall (words, numbers),
paired-associates (words, dot patterns, shapes, numbers), verbal concept formation, and maze learning.
In both studies a variety of aptitude tests were also administered.

The original analyses of these data were somewhat problematic (see Cronbach & Snow, 1977), but
a reanalysis conducted by Snow et al. (1984) using multidimensional scaling (MDS) revealed a number
of dimensions by which the learning tasks could be organized. First, in both studies, learning tasks
varied systematically in complexity. This was indicated by two findings: The learning tasks varied
substantially (a) in the degree to which performance on them correlated with measures of general
intellectual ability, and (b) in how close to the center of the multidimensional scaling configuration they
appeared. Centrality reflects the average correlation of a test with other tests in the battery and may be
taken as a measure of complexity (Marshalek, Lohman, & Snow, 1983; Tversky & Hutchins, 1986).
Snow et al. suggested that the complexity relationship could be due either to some tasks
subsuming others in terms of process requirements or to increased involvement of executive control
processes such as goal monitoring.
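
The idea of centrality as an average correlation can be made concrete with a short sketch. The correlation values and test names below are invented for illustration and are not data from Allison, Stake, or Snow et al.; the function name `centrality` is likewise an assumption. The sketch simply computes, for each test, its mean correlation with every other test in a hypothetical battery.

```python
from statistics import mean


def centrality(corr, names):
    """Return each test's mean correlation with the other tests in the battery."""
    scores = {}
    for i, name in enumerate(names):
        others = [corr[i][j] for j in range(len(names)) if j != i]
        scores[name] = mean(others)
    return scores


# Hypothetical correlations among four made-up tests (symmetric, unit diagonal).
names = ["concept_formation", "paired_associates", "maze_tracing", "rotary_pursuit"]
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.1],
    [0.4, 0.2, 0.1, 1.0],
]

scores = centrality(corr, names)
# Tests with a higher mean correlation would tend to sit nearer the center
# of a scaling configuration; here they are simply ranked.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

This is only a proxy: an actual MDS solution places tests by fitting distances to the whole correlation matrix, so position in the configuration need not match this ranking exactly.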

Second, in both analyses, there was evidence for a novel vs. familiar learning task dimension, which
Snow et al. (1984) interpreted as supporting the classical distinction between fluid and crystallized
intelligence (Cattell, 1971), but which might also be seen as supporting an inductive vs. rote learning
distinction. In the Allison analysis, the paired-associates tasks and some of the concept formation tasks
appeared on one side of the scaling configuration. The concept formation tasks so positioned were
those which repeatedly used the same stimuli, thus enabling the successful use of a purely rote strategy.
On the other hand, the assembly tasks and the novel plotting task, which required subjects to assemble
a new solution procedure essentially from scratch, appeared on the opposite side of the configuration.

The MDS analysis of the Stake (1961) data (learning rate scores) similarly suggested a
fluid/inductive vs. crystallized/rote dimension. Listening comprehension, verbal paired-associates, and
verbal free recall tasks appeared on the crystallized side of the configuration. The verbal concept
formation task, along with the spatial and number pattern paired-associates tasks, which were partially
amenable to an inductive learning strategy (response patterns could, but did not have to, be induced),
fell on the fluid/inductive learning end.

The Snow et al. (1984) reanalysis thus provides a number of ideas that could facilitate taxonomy
development. In particular, it suggests task complexity and learning environment (inductive/novel vs.
rote/familiar) dimensions. Does this suggest we ought to continue along these lines to develop a full
taxonomy? Unfortunately, we see two problems with the approach. One is simply practicality.
Because of the time and expense involved in collecting data on performance of learning tasks, which
typically require many more subject hours than do other cognitive measures, there have not been the
same kind of large-scale empirical analyses of learning task batteries as there have been of intelligence
test batteries (although data sets reviewed in Glaser (1967) and Cronbach & Snow (1977) could be
reanalyzed along the lines of the Snow et al. approach). Even with the well-designed studies Snow et al.
reanalyzed, there is considerable under-determination of process dimensions, due to the fact that not
enough varieties of learning tasks were (or could have been) administered by Allison (1960) and Stake
(1961). Thus, although the dimensions revealed in the Snow et al. reanalysis are suggestive, they
certainly do not seem a sufficient basis for proposing a taxonomy of learning skills. It might take more
like a few hundred diverse learning tasks to be able to see something that might serve as the basis for a
true full-blown taxonomy. Obviously, such a study would be prohibitively expensive.

A second problem with the empirical-correlational approach to taxonomy building is one inherent
in a purely bottom-up approach to theory development. That is, on what basis should learning tasks be
selected for inclusion in a to-be-analyzed battery in the first place? Factor-correlational structures or
categories directly reflect the nature of the tasks included in the analysis, and only these tasks; thus, the
empirical approach is inherently analytic and, in some sense, conservative. Correlational analyses
certainly may be useful for initial forays, or purely exploratory work, in suggesting underlying task
relationships that might not have been anticipated at the outset. But the approach cannot be complete in any
sense. One cannot simply be sure to "sample a broad range of tasks." A sampling scheme for choosing
tasks already implies a taxonomy. Clearly, some means for generating original taxonomic categories is
required.

Information Processing Model-Based Taxonomies 

The two classes of learning taxonomies thus far discussed have their roots in schools of thought
(behaviorism in the case of rational taxonomies, psychometrics in the case of the empirical-correlational
taxonomies) that are historically prior to modern cognitive psychology. One unfortunate side effect of
the cognitive revolution had been a decline of interest in learning phenomena. Until the mid-1960s,
when behaviorism was still largely predominant, learning issues held center stage. With the subsequent
rise of cognitive psychology and the information processing perspective, theories of memory and
performance came to dominate. Only recently has there been a rather sudden and dramatic upsurge of
interest in learning from an information processing perspective. Although many of the same issues
remain, these second looks at learning through newer theories (e.g., Anderson, 1983; Rosenbloom &
Newell, 1986; Rumelhart & Norman, 1981) have resulted in a richer theoretical picture of learning
phenomena.



Corresponding to this rise of interest in learning, there have been proposals for model-based
categories or taxonomies of learning types. These attempts differ from the empirically based individual
differences taxonomies in that they have not yet been completely validated, at least not as taxonomies
of learning skills. However, we do see a correspondence between some of the dimensions that have
emerged in the individual differences analyses and some of the proposed learning mechanisms and
categories, which we will point out as we go along. The model-based taxonomies differ also from the
rational taxonomies in that they arise not simply from speculation and rational task analysis (although
they certainly incorporate such methods) but from systematic information processing models of
learning that have been demonstrated to be specified to a degree of precision sufficient for
implementation as operational computer programs. Thus, taxonomies in this category are those
investigations that have entailed the use of computer simulation of learning processes as a means of
developing learning theory.

One model-based taxonomy is suggested by Anderson's (1983) ACT* theory. The theory proposes
two fundamental forms of knowledge. Procedural knowledge (knowledge how) is represented in the
form of a production system, a set of if-then rules presumed to control the flow of thought. Declarative
knowledge (knowledge that) is represented in the form of a node-link network of propositions, which
are presumed to embody the content of thought.

The ACT* theory in its most recent formulation (Anderson, 1983; 1987a) specifies three basic types
of learning: one to accommodate declarative (fact) learning, one specific to procedural learning, and
one applicable to both types. Learning in declarative memory is accomplished solely by the
probabilistic transfer to long-term memory of any new proposition (that is, a set of related nodes and
links) that happens to be active in working memory. It is worth noting that Underwood et al.'s (1978)
finding of a broad and general associative learning factor lends empirical support to Anderson's claim
for a single declarative learning mechanism.
A second learning mechanism, knowledge compilation, accounts for procedural learning.
Knowledge compilation actually consists of two related processes. Learning by composition is the
collapsing of sequentially applied productions into one larger production. This corresponds to the
transition from step-by-step execution of some skill to "one-pass" or all-at-once execution. Learning by
proceduralization is a related process in which a production becomes specialized for use in a particular
task. This corresponds to the transition from the use of general problem-solving skills on novel
problems to the employment of specialized, task-specific skills tuned to the particular problem at hand.
Anderson's third learning mechanism, strengthening, operates somewhat analogously to the traditional
learning principle of reinforcement. Both facts and procedures are presumed to get stronger, and
hence more easily and more reliably retrieved, as a function of repeated practice.
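
Composition and strengthening can be sketched in a few lines of code. The following is a toy illustration, not Anderson's implementation: the `Production` class, the `compose` and `strengthen` functions, the fixed strength increment, and the example rules are all assumptions made for this sketch, and ACT* itself handles variable binding and goal structures that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Production:
    """A hypothetical if-then rule: fires when all conditions are in working memory."""
    name: str
    conditions: frozenset
    actions: frozenset
    strength: float = 1.0


def compose(first, second):
    """Collapse two sequentially applied productions into one larger production.

    Conditions of the second rule that the first rule's actions already supply
    are dropped, because the composed rule produces them internally.
    """
    conditions = first.conditions | (second.conditions - first.actions)
    actions = first.actions | second.actions
    return Production(f"{first.name}+{second.name}", conditions, actions)


def strengthen(rule, increment=0.5):
    """Toy strengthening: each use adds a fixed increment (an assumption)."""
    rule.strength += increment
    return rule


# Two made-up steps in a column-addition skill.
add_ones = Production("add_ones", frozenset({"goal_add", "ones_digits"}), frozenset({"ones_sum"}))
carry = Production("carry", frozenset({"ones_sum", "sum_over_nine"}), frozenset({"carry_set"}))

one_pass = compose(add_ones, carry)   # step-by-step becomes "one-pass"
for _ in range(3):                    # three practice trials
    strengthen(one_pass)

print(sorted(one_pass.conditions))
print(one_pass.strength)
```

The composed rule no longer needs the intermediate result `ones_sum` as a condition, which mirrors the shift from step-by-step to all-at-once execution described above.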

To appreciate Anderson's theory, it is important to note that it models the dynamics of skill
transition, and is not simply a list of the different ways in which learning can occur or a categorization
of learning tasks. The basic idea is that upon initial exposure to novel material, such as a geometry or
computer programming lesson, the learner first engages in declarative learning, forming traces of the
various ideas presented. Then, when given problems to solve later in the lesson, the learner employs
very general methods such as analogy, random search, or means-ends analysis, which operate on the
declarative traces to achieve solution. Employing these very general methods is cognitively taxing in
that it severely strains working memory (to keep track of goals and the relevant traces), and thus initial
problem solving is slow and halting. But portions of the process of using these general methods and
achieving particular outcomes (some of which actually lead closer to solution) are automatically
compiled while they are being executed. This is the procedural learning component. The learner
essentially remembers the sequence of steps associated with solving a particular problem, or at least
parts of the problem. Then when confronted with the problem again at some point in the future, the
learner can simply recall that sequence from memory, rather than have to rethink the steps from
scratch. With practice on similar problems, the compiled procedure is strengthened, which produces
more reliable and faster problem solving. With continued practice, the skill ultimately is automatized,
in that it becomes possible to execute the skill without conscious awareness and without drawing on
working memory resources.

Again, there may be a correspondence between an empirically based individual difference 
dimension and a distinction implicit in the model-based taxonomy. Snow et al.'s novel learning tasks, 
presumed to tap fluid intelligence, may be likened to Anderson's novel learning situations, which 
presumably tap very general problem-solving skill. On the other side, Snow et al.'s familiar learning 
tasks, which call on crystallized skills, can be characterized in ACT* terms as engaging the declarative 
learning mechanism or involving the retrieval of already-compiled procedures. It is noteworthy that 
despite rather major differences in methodology inherent in the individual differences vs. model-based 
approaches, there is some convergence in the categories of learning skills. Although Anderson (1983; 
1987) views the emergence of the learning dimension as the result of the transition of skill, rather than 
perhaps as an array of fundamentally different kinds of learning tasks, there is a basic compatibility 
between the conclusions of the research approaches. 

A second approach to building a model-based taxonomy is based on an integration of the literature 
from the Artificial Intelligence subspecialty of machine learning. Taxonomies of research in machine 
learning (Carbonell, Michalski, & Mitchell, 1983; Langley, 1986; Michalski, 1986; Self, 1986) have been 
proposed, and there even exists something of a consensus in the field regarding the categories in the 
taxonomy. 

One dimension of machine learning research particularly relevant to our concerns here is learning 
strategy, which Michalski (1986) defines as the type of inference employed during learning, and which he 
characterizes as follows: 

In every learning situation, the learner transforms information provided by a teacher (or 
environment) into some new form in which it is stored for future use. The nature of this 
transformation determines the type of learning strategy used.... These strategies are 
ordered by the increasing complexity of the transformation (inference) from the 
information initially provided to the knowledge ultimately acquired. Their order thus 
reflects increasing effort on the part of the student and correspondingly decreasing effort 
on the part of the teacher. (p. 14) 

It is interesting that the classification of machine learning research yields such a nice process 
classification and thereby seems promising as a realization of Melton's ultimate hopes for a taxonomy 
of learning. The kinds of inferencing strategies Carbonell et al. and Michalski suggest are listed in 
Table 1. (We have added an additional category, Learning by Drill & Practice, to the list, because we 
use the list as the basis for one of our proposed taxonomy categories, and it is convenient to denote 
that here.) Note that while there may be some similarity between Carbonell et al. and Michalski's 
categories and those proposed by Melton, Gagne, and others, the basic difference is the fact that in the 
Carbonell-Michalski system, the underlying motivation for distinctions is necessarily the existence of 
differences in cognitive processing requirements. We will return to a more thorough discussion of 
these categories in the next section. 

We believe that Anderson's (1983) and Carbonell-Michalski's (1983; 1986) model-based attempts to 
propose varieties of learning represent an advance beyond either the rational or empirically based 
taxonomies and go a long way toward abating some of the most severe criticisms of earlier taxonomies. 
Yet all three approaches yield ideas on the varieties of learning skills that might be fruitfully 
synthesized. The remainder of this paper represents our initial attempt to integrate these ideas. 

Table 1. Learning Strategies From a Taxonomy of Machine Learning Research 

Rote Learning: Learning by direct memorization of facts without generalization. 

Learning from Instruction: The process of transforming and integrating instructions from an external 
source (such as a teacher) into an internally usable form. 

Learning by Deduction 

    Knowledge Compilation: Translating knowledge from a declarative form that cannot be used 
    directly into an effective procedural form; for example, converting the advice "Don't get wet" 
    into specific instructions that recommend how to avoid getting wet in a given situation. 

    Caching: Storing the answer to frequently occurring questions (problems) in order to avoid a 
    replication of past efforts. 

    Chunking: Grouping lower-level descriptions (patterns, operators, goals) into higher-level 
    descriptions. 

    Creating Macro-Operators (Composition): An operator composed of a sequence of more 
    primitive operators. Appropriate macro-operators can simplify problem solving by allowing 
    a more "coarse-grained" problem-solving search. 

Learning by Drill and Practice: Refining or tuning knowledge (or skill) by repeatedly using it in 
various contexts, allowing it to strengthen and become more reliable through generalization and 
specialization. 

Inductive Learning: Learning by drawing inductive inferences from facts and observations obtained 
from a teacher or an environment. 

Learning by Analogy: Mapping information from a known object or process to a less known but 
similar one. 

Learning from Examples: Inferring a general concept description from examples and (optionally) 
counterexamples of that concept. 

Learning from Observation & Discovery: Constructing descriptions, hypotheses, or theories about a 
given collection of facts or observations. In this form of learning there is no a priori classification 
of observations into sets exemplifying desired concepts. 

Note. All categories except Deductive Learning (Michalski, 1988) are from Carbonell et al. (1983). 
The definitions are taken from the glossary in Michalski, Carbonell, and Mitchell (1986). Learning by 
Drill and Practice was not a category included in these sources, but we included it in the taxonomy and 
thus, for economy, we describe it here. 

III. A PROPOSED TAXONOMY OF LEARNING 

Thus far we have discussed why a taxonomy of learning is important, and what others have done in 
the way of proposing taxonomies. Our goal for this section of the paper is to propose a taxonomy 
based on a synthesis of some of the ideas just reviewed, with an eye toward two major objectives. First, 
the taxonomy should be useful as a learning task analysis system. That is, it should be useful in 
answering questions like: What are the component skills involved in learning to disassemble a jet 
engine, or operate a camera, or program a computer, or make economic forecasts? Second, the 
taxonomy should serve to focus our research. Specifying the ways people learn may suggest where we 
ought to be expending more research energy. We do not see this as dictating research directions, as 
some critics of psychological taxonomies have suggested (Martin, 1986), but as suggesting potentially 
high-payoff research directions. For example, we already know much about declarative learning, such 
as what kinds of individual differences to expect and its relation to other cognitive skills. We know 
considerably less about procedural learning skills. The taxonomy may pinpoint other learning skills on 
which research attention may productively be focused. 

We have selected four dimensions, illustrated in Figure 1, as particularly important in classifying 
learning skills. The two dimensions shown in Figure 1a, knowledge type and instructional 
environment, are motivated primarily by our discussion of the Anderson and Carbonell-Michalski 
systems, respectively, although Gagne's ideas on learned capabilities served to broaden the range of 
categories included in knowledge type. The crossing of these two dimensions (Figure 1a) defines a 
space of general learning tasks. 

The motivation for the other two dimensions, illustrated in Figures 1b and 1c (domain and learning 
style), became apparent when we began examining applications of the taxonomy, which we discuss in 
the next section of the paper. Figure 1b illustrates a hypothetical domain-space as the crossing of the 
degree of quantitativeness and the importance of quality vs. speed in decision making. The idea is that 
any domain can be located in such a space, and that the set of learning skills defined by the first two 
taxonomy dimensions (Figure 1a) may prove to be empirically distinct from parallel learning skills in 
other domains. We represent this idea in Figure 1b by scattering knowledge type by instructional 
environment matrices over the domain space, for various occupational-training domains. The two 
dimensions portrayed in the domain space are only suggestive, and are meant only to express how 
domain interacts with the first two taxonomy dimensions. Finally, Figure 1c lists a variety of possible 
learning styles, which, we propose, must be considered in conjunction with the first three taxonomy 
dimensions in determining what skills are being tapped by a particular learning task. 



Knowledge Type 

The declarative-procedural distinction is fundamental. Further refinements are possible; 
declarative knowledge can be arrayed by complexity, from propositional knowledge to schemata 
(packets of related propositions). Similarly, procedural knowledge can be arrayed from simple 
productions, to skills (packets of productions that go together), to automatic skills (skills executed with 
minimal cognitive attention). Productions and skills can also be arrayed by generality, from a narrow 
(specific) to a broad (general) range of applicability. A final knowledge type is the mental model, 
which requires the concerted exercise of multiple skills applied to elaborate schemata. Knowledge 
types are dynamically linked: Acquisition of a set of propositions may be prerequisite to acquisition of 
a related schema, or to a procedural skill; both in turn may be prerequisite to acquisition of some 
mental model. 

[Figure 1 graphic not reproduced. One legible domain label in panel (b) reads Economics (Smithtown). 
Panel (c) contrasts the following learning styles: holistic vs. serial processing; active/impulsive vs. 
passive/reflective orientation; systematic/punctilious vs. haphazard/exploratory approach; theory-driven 
(top-down) vs. data-driven (bottom-up); spatial vs. verbal representation; deep vs. superficial 
processing; low vs. high internal motivation.] 

Figure 1. Learning skills taxonomy: a) Environment by knowledge-type matrix; cell entries would be 
various learning tasks; b) Environment by knowledge-type matrices plotted in a hypothetical two- 
dimensional domain-space: proximal matrices should show relatively greater transfer among 
parallel learning skills; c) Suggested learning styles that might interact with other taxonomy 
dimensions in determining what learning skill a particular learning task measures. 

In cognitive science circles, the declarative-procedural distinction is sometimes said to be formally 
problematic in that declarative knowledge can be mimicked by procedures (Winograd, 1975). One can 
declaratively know that "Washington was the first president"; alternatively, one can have the procedure 
to respond "Washington" when asked "Who was the first president?" We finesse the problem here by 
keeping close to an operational definition of knowledge type: We define knowledge in terms of how it 
is tested. Declarative knowledge can be probed with a fact recognition test (sentence recognition, word 
matching, etc.), or in the case of schemata, with clustering and sorting tasks (e.g., Chi, Feltovich, & 
Glaser, 1981). Procedural knowledge requires a demonstration of the ability to apply the knowledge to 
predict the output of some operator (operator tracing) or to generate a set of operators to yield some 
output pattern (operator selection). Possession of skills and automatic procedures may be 
operationally determined by examining the degree of performance decrement under imposition of 
secondary tasks (Wickens, Sandry, & Vidulich, 1983) or through other methods of increasing 
processing demands (Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977; Spelke, Hirst, & Neisser, 
1976). Possession of an appropriate mental model might require testing performance on a complex 
simulation of some target task. An illustrative (not exhaustive) list of tests for the various knowledge 
types is given in Table 2. 
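
The two procedural tests named above, operator tracing and operator selection, can be made concrete with logic gates, the domain used in Table 2. The following Python sketch is ours and purely illustrative: the gate set beyond AND, the encoding of HIGH and LOW as booleans, and the function names `trace` and `select` are assumptions, not part of any testing system described in this paper.

```python
HIGH, LOW = True, False

# AND comes from Table 2; OR and XOR are added here only for illustration.
GATES = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
}


def trace(gate, a, b):
    """Operator tracing: predict the output of a given gate for given inputs."""
    return GATES[gate](a, b)


def select(a, b, target):
    """Operator selection: list the gates that produce `target` from the inputs."""
    return [name for name, fn in GATES.items() if fn(a, b) == target]


tracing_answer = trace("AND", HIGH, LOW)      # the item (AND, HIGH, LOW) = ?
selection_answer = select(HIGH, LOW, HIGH)    # the item (?, HIGH, LOW) = HIGH
```

A test item is then scored by comparing the student's response with `tracing_answer` or checking membership in `selection_answer`.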

Table 2. Sample Tests for the Various Knowledge Types (from the Domain of Logic 
Gate Circuits) 

Knowledge Type    Type of Test              Sample Item 

Proposition       Sentence Verification     "AND yields High if all inputs are high, Low 
                                            otherwise - True or False?" 
                  Stimulus Matching         "AND D - Match or Mismatch?" 
                  Paired-associates         "Which symbol is associated with AND?" 
                  Free Recall (components)  "What are the different types of logic gates?" 

Schema            Free Recall (structure)   "Reproduce the circuits you just studied" 
                  Sorting                   "Sort the circuits into categories" 
                  Classification            "Pair circuit diagrams with these devices" 
                  Sentence Completion/      "AND yields ___ if all ___ are ___." 
                  Cloze 
                  Lexical Decision          "XAND is a legal logic gate - True or False?" 

Rule              Operator Tracing          Determine output of logic gate: 
                                            (AND, HIGH, LOW) = ? 
                  Operator Selection        Choose an operator to achieve a result: 
                                            (?, HIGH, LOW) = HIGH 

General Rule      Transfer-of-Training      Learn and be tested on other kinds of logical 
                                            relations such as those introduced in symbolic logic 

Skill             Multiple Operator         Trace through (or select) a series of linked logic 
                  Tracing/Selection         gates in a circuit (could also use hierarchical menus 
                                            methodology) 

General Skill     Transfer-of-Training      Learn and be tested on constructing or verifying 
                                            logical proofs 

Automatic Skill   Dual-task                 Trace logic gates while monitoring a secondary 
                                            signal 
                  Complexity-increase       Trace logic gates that become increasingly 
                                            complex 

Mental Model      Process Outcome           Troubleshoot a Simulated Target Task; Walk-Through 
                  Prediction                Performance Test 

Instructional Environment 

Instruction delivered in a classroom setting or even on a computer will inevitably provide the 
student with opportunities to incorporate the material in multiple ways. Real instruction occurs in a 
diverse environment from the standpoint of student vs. teacher control and consequently in the kinds of 
inferences students are required to make. Even in the lecture environment, students may engage a 
variety of inferencing strategies. Nevertheless, it is useful to differentiate instructional environments in 
a local sense: It should be possible to tag a specific instruction segment as to the form in which it is 
delivered and the kinds of inference processes or learning strategies it is likely to invoke. Following 
Carbonell et al. and Michalski (Table 1), we propose to characterize local instructional environments 
according to the amount of student control in the learning process. At one end, rote learning (e.g., 
memorizing the times table) involves full teacher control, little student control. Didactic learning (by 
textbook or lecture), learning by doing through practice and knowledge compilation, learning by 
analogy, learning from examples, and learning by observation and discovery offer successively more 
student control, and less teacher control. 
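
Because the environments are ordered by student control, tagging instruction segments amounts to assigning each one a position on an ordinal scale. A minimal Python sketch follows; the integer codes, the member names, and the three example segments are assumptions made for illustration, and only the ordering itself is taken from the text.

```python
from enum import IntEnum


class Environment(IntEnum):
    """Local instructional environments, ordered by increasing student control."""
    ROTE = 1
    DIDACTIC = 2      # textbook or lecture
    PRACTICE = 3      # learning by doing
    ANALOGY = 4
    EXAMPLES = 5
    DISCOVERY = 6     # observation and discovery


# Hypothetical tags for three segments of a lesson.
segments = {
    "memorize the times table": Environment.ROTE,
    "read the textbook chapter": Environment.DIDACTIC,
    "explore a simulation": Environment.DISCOVERY,
}

# The segment that hands the most control to the student.
most_student_control = max(segments, key=segments.get)
```

An ordinal code is enough here because the text claims only a ranking, not distances between environments.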

Note that we modify the Carbonell-Michalski list slightly by combining their learning by deduction 
(compilation) category with a learning by refinement category (suggested to us by W. Regian, personal 
communication, May 4, 1987). What we are pinpointing is the ability to refine one's skill (by 
strengthening, generalization, and discrimination) based on feedback following performance. Before 
one is engaged in this kind of learning, we assume the skill has already been acquired (perhaps in a 
rote fashion) and compiled, and is now at the phase of being refined. But because compilation and 
refinement are probably hopelessly intertwined in actual learning contexts, we combine them into a 
single learning-by-doing (Practice environment) category. 

Domain (Subject Matter) 

The inclusion of subject matter as a taxonomy dimension reflects the fact that much of learning has 
a strong domain-specific character. One can be an expert learner in one domain and a poor learner in 
another. Certainly there is some generality in learning skills over domains. Glaser, Lesgold, and 
Lajoie (in press) suggested that metacognitive skills might be fairly generalized. But even here, there is 
not much evidence that metacognitive skill in mathematics (Schoenfeld, 1985) predicts metacognitive 
skill in writing (Hayes & Flower, 1980). 



It is appropriate to ask the question of the topic range over which some general learning skill is 
likely to be useful. It may be that the degree to which a subject matter taps quantitative or technical 
knowledge, and the degree to which it taps verbal knowledge, captures some of the transfer relations 
among academic subjects. The degree of social involvement may also play a role, especially when one 
considers the universe of occupational training courses rather than simply academic training. As is 
suggested in Figure 1b, the relative importance of speed vs. quality in decision-making 
may be a critical domain dimension. But again, the dimensions portrayed in Figure 1b are only meant 
to be suggestive. 

More generally, we envision a complete domain-space. The underlying dimensionality of such a 
space could be discovered through a study of the similarity (either judged or as shown in transfer of 
performance relations) among all jobs, courses, or learning experiences in any specifiable universe of 
interest, and could be represented as a multidimensional scaling of the jobs or courses so rated. An 
empirically determined domain-space would specify the likelihood that (or the degree to which) a 
particular taxonomic skill, defined by the environment and the knowledge type, would transfer to or be 
predictive of a parallel skill (i.e., one defined by the same environment and knowledge type) in another 
domain. Proximal domains, in the multidimensional space, would yield high transfer among parallel 
skills; distal domains might yield only minimal transfer. For example, assuming the importance of the 
quantitative dimension, skill in learning mathematics propositions through didactic instruction might 
predict skill in learning physics propositions through instruction; but neither may be related to the 
ability to learn history propositions through instruction. 
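
The proximity argument can be illustrated with a toy domain-space. In the Python sketch below the coordinates are invented placeholders, not empirical scaling results; they merely place mathematics and physics near each other on a quantitativeness axis and history far from both, as the example above assumes. The axis interpretation and the function names are likewise assumptions.

```python
import math

# Hypothetical coordinates: (quantitativeness, emphasis on speed over quality).
# These values are made up for illustration; a real domain-space would come
# from scaling judged similarities or observed transfer relations.
DOMAINS = {
    "mathematics": (0.9, 0.3),
    "physics": (0.8, 0.3),
    "history": (0.1, 0.2),
}


def distance(a, b):
    """Euclidean distance between two domains in the toy space."""
    return math.dist(DOMAINS[a], DOMAINS[b])


def nearest(domain):
    """The most proximal other domain, where transfer is expected to be highest."""
    others = [d for d in DOMAINS if d != domain]
    return min(others, key=lambda d: distance(domain, d))


closest_to_math = nearest("mathematics")
```

Under these placeholder coordinates, a parallel skill in physics would be the best candidate for transfer from mathematics, and history the worst.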



Learning Style 

All sorts of subject characteristics (aptitudes, personality traits, background experiences) affect 
what is learned in an instructional setting. But we focus on characteristics of the learner's preferred 
mode of processing or learning style, because our primary concern is characteristics over which the 
instructional designer may exercise control. Because style implies a choice by subjects as to how to 
orient themselves toward the learning experience, it should be manipulable through instruction. 

A considerable literature on cognitive style exists (Messick, 1986). Among those that have received 
the most attention are field dependence-independence (Goodenough, 1976) and cognitive complexity 
(Linville, 1982), but these are now presumed primarily to reflect ability (e.g., Cronbach & Snow, 1977; 
Linn & Kyllonen, 1981). Impulsivity-reflectivity (Baron, Badgio, & Gaskins, 1986; Meichenbaum, 
1977) more clearly fits our criteria for inclusion in the taxonomy, in that it is malleable: Subjects can be 
trained to be more reflective in problem solving, and this improves performance. Other styles we 
consider in our analyses of learning environments are holistic vs. serial processing, activity level, 
systematicity and exploratoriness, theory-driven vs. data-driven approaches, spatial vs. verbal 
representation of relations (Perrig & Kintsch, 1984), superficial vs. deep processing, and low vs. high 
internal motivation. Some dimensions may affect learning outcomes quantitatively: Active students 
may learn more. Others may affect outcomes qualitatively: Spatial vs. verbal representations will result 
in different relationships learned. 

Cognitive style may interact with other taxonomy dimensions in determining what learning skill is 
being tapped in instruction. A study by Pask and Scott (1972), which identified holist vs. serialist 
processing styles, can illustrate this interaction. In this study, serialists, who focus on low-order 
relations and remember information in lists, were contrasted with holists, who focus on high-order 
relations and remember the overall organization among items to be learned. Pask and Scott showed 
that presenting a learning task (i.e., learning an artificial taxonomic structure) in a way that matched 
the learner's style resulted in better overall learning. A critical point for this discussion is that the 
presentation of material should tap different skills for subjects who differ on this style dimension. 
Presenting a long list of principles may be a difficult memory task for serialists, who attempt to 
memorize each relationship presented. For holists, the same task may tap conceptual reorganization 
skill rather than memorization skill. 

Summary 

The first three dimensions of the taxonomy define a space of learning tasks (Figure 1a set in the 
domain-space of Figure 1b). Each cell represents a task that teaches a particular subject matter (e.g., 
physics principles: Newton's second law), by a particular means (e.g., by analogy), resulting in a 
particular kind of knowledge outcome (e.g., a schema). A particular taxonomic learning skill then may 
be defined by performance on a particular taxonomic learning task. There will be interactions among 
dimensions: Some subject matters lend themselves more readily to certain kinds of knowledge 
outcomes. For example, propositions are emphasized in non-quantitative fields; procedures are the 
focus in quantitative fields. And knowledge outcomes covary with instructional method; we more 
commonly learn propositions than procedures by rote. 
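
A cell in this space can be written down as a simple record with one field per dimension. The Python sketch below is one possible encoding, not a specification: the class name, the field names, the string labels, and the second and third example tasks are assumptions. The `parallel_to` method follows the earlier definition of parallel skills as sharing environment and knowledge type while differing in domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningTask:
    """One cell of the taxonomy: what is taught, by what means, with what outcome."""
    domain: str
    environment: str
    knowledge_type: str

    def parallel_to(self, other):
        """True if both tasks share environment and knowledge type but not domain."""
        return (self.environment == other.environment
                and self.knowledge_type == other.knowledge_type
                and self.domain != other.domain)


# The example from the text, plus two hypothetical comparison tasks.
newton = LearningTask("physics", "analogy", "schema")
algebra = LearningTask("mathematics", "analogy", "schema")
drill = LearningTask("physics", "rote", "proposition")
```

Here `newton` and `algebra` would be parallel tasks, while `newton` and `drill` occupy different cells of the same domain.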

As an illustration of some of these ideas, consider the instructional goal, extracted from a 
programming text, of teaching the concept of electric field (Glynn, Britton, Semrud-Clikeman, & Muth, 
in press). A rote approach might be to have students simply memorize the definition: "an electric field 
is a kind of aura that extends through space." A didactic approach might have students read the 
definition embedded in the context of a larger lesson, then demonstrate understanding by 
paraphrasing it. The difference between the two 
approaches could be reflected in the way in which the knowledge was tested. The appropriate rote test 
would be verbatim recognition or recall; the appropriate instruction test would be paraphrase 
recognition or recall.* 

The electric field concept could be instructed by having students practice using it; following a 
discussion of properties of force, such as how an electrical force holds an electron in orbit around a 
proton, students would be given an opportunity to solve problems that made use of the concept. One 
could also lead students to induce the concept, by pointing out how it is analogous to a gravitational 
field, by providing them with examples and counterexamples, or by having them discover it with a 
simulator or in a laboratory. 

Unlike the first three dimensions, the fourth dimension, learning style, refers to characteristics of 
the person rather than the environment. Inclusion of the learning style dimension is an admission that 
providing a particular kind of environment guarantees neither the kind of learning experience that will 
result nor the kind of learning skill being tapped. Person characteristic by instructional treatment 
interactions exist (Cronbach & Snow, 1977, especially Chapter 11); thus, as we tried to illustrate in the 
example on holist vs. serialist processing, the style engaged at the time of learning and testing will 
partly determine what learning skill is being measured. 

*Interestingly, test-question type has been shown to determine a learner's subsequent processing 
strategy (Fredericksen, 1984; Sagerman & Mayer, 1987). 



IV. APPLYING THE TAXONOMY: THREE CASE STUDIES 

Our goal for this section of the paper is to consider how the learning taxonomy might facilitate the 
development of indicators of learning skill in actual practice. We consider this a kind of test run for the 
taxonomy. We have proposed a taxonomy; it is now appropriate to demonstrate how it might be 
applied. We discuss three computerized instructional programs, each of which includes some capability 
for determining what and how students are learning. We suggest ways in which additional learning 
indicators might be generated in light of our taxonomy. 

We see the taxonomy playing two roles here. One, though not the focus of the paper, is to help us 
classify instructional programs. By our taxonomy, similar programs are ones that teach the same type 
of knowledge (propositions, skills, etc.), provide the same instructional environment (rote, discovery, 
etc.), teach the same domain material (computer programming, economics, etc.), and encourage the 
same kind (style) of learner interaction (reflectivity, holistic processing, etc.). Programs are dissimilar 
to the degree that they mismatch on these dimensions. An important part of our discussion of the 
three tutoring systems then is to indicate at least informally what learning skills are being exercised, 
and to what degree. 

The second and (for current purposes) more important role for the taxonomy is to assist us in 
thinking more broadly about learning skills and outcomes. The taxonomy, with its specified methods 
and tests, can pinpoint what potentially important learning events are simply not being measured by 
existing instructional programs. We can imagine generating alternative instructional programs by 
varying the degree to which different kinds of learning skills are exercised. 

The three programs we discuss in this section are intelligent tutoring systems, and so we begin by 
providing a few preliminary remarks on their general organization. 

General Comments on Intelligent Tutoring Systems 

Figure 2 illustrates the components of a hypothetical and somewhat generic intelligent tutoring 
system. In this system, the student learns by solving problems, and a key system task is to generate or 
select problems that will serve as good learning experiences. 

The system begins by considering what the student already knows (the STUDENT MODEL), what 
the student needs to know (the CURRICULUM), and what curriculum element (lesson or skill) ought 
to be instructed next (the TEACHING STRATEGY). From these considerations the system selects 
(or generates) a problem, then either works out a solution to the problem (with its DOMAIN 
EXPERT) or simply retrieves a prepared solution. The program then compares its solution to one the 
student has prepared, and performs a diagnosis based on the differences between the solutions. 

The program provides feedback, based on STUDENT ADVISOR considerations such as how long 
it has been since feedback was last provided, whether the student was already given a particular bit of 
advice before, and so forth. After this, the program both updates the student skills model (a record of 
what the student knows and does not know) and increments learning progress index counters. These 
updating activities modify the STUDENT MODEL, and the entire cycle is repeated, starting with 
selecting (or generating) a new problem. 
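
The cycle described in the last two paragraphs can be summarized as a loop. The Python sketch below is a deliberately reduced illustration under several assumptions of ours: the student model is a dictionary of known/unknown flags, the teaching strategy is "first unknown skill in curriculum order", solutions are retrieved rather than generated, and diagnosis is a simple equality check. No actual tutoring system is being reproduced.

```python
def tutoring_cycle(student_model, curriculum, answer_fn, max_problems=10):
    """Run a problem-test-feedback loop; return a log of (skill, correct, feedback)."""
    log = []
    for _ in range(max_problems):
        # TEACHING STRATEGY: choose the first curriculum skill not yet known.
        pending = [s for s in curriculum if not student_model.get(s, False)]
        if not pending:
            break
        skill = pending[0]
        problem, expected = curriculum[skill]   # retrieve a prepared solution
        answer = answer_fn(problem)             # the student's solution
        correct = answer == expected            # diagnosis by comparison
        feedback = "ok" if correct else f"review {skill}"   # STUDENT ADVISOR
        student_model[skill] = correct          # update the STUDENT MODEL
        log.append((skill, correct, feedback))
    return log


# Hypothetical curriculum and a student who gets the second problem wrong.
curriculum = {"add": ("2+2", 4), "mul": ("3*3", 9)}
student_answers = {"2+2": 4, "3*3": 8}
model = {}
log = tutoring_cycle(model, curriculum, student_answers.get, max_problems=3)
```

In this run the loop presents the addition problem once and then repeats the multiplication problem until the problem budget is exhausted, which shows why a real advisor would vary its feedback rather than repeat it.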

Not all ITSs include all these components, and the problem-test-feedback cycle does not adequately 
characterize all systems. But this system fairly describes many existing ITSs and perhaps most 
interactions with human tutors. Thus, an examination of the components of the generic tutor should 
yield some ideas on how learning progress and the current status of the learner may be indicated. Note 
that much of this information is contained in the dynamic student model. We now discuss three 
instantiations of this generic tutor. 

[Figure 2 graphic not reproduced; the only legible box label is GENERATE-A-PROBLEM.] 

Figure 2. Components of a generic intelligent tutoring system. (Boxes represent decisions the program 
makes; ellipses represent knowledge bases the program consults.) 

(1) BIP: Tutoring Basic Programming 

General System Description 

The Basic Instruction Program (BIP) was developed at Stanford University's Institute for 
Mathematical Studies in the Social Sciences and was one of the first operational intelligent tutoring 
systems (Barr, Beard, & Atkinson, 1976; Wescourt, Beard, Gould, & Barr, 1977).* BIP teaches 
students how to write programs in the language BASIC, by having the student solve problems of 
increasing difficulty. The system selects problems according to what the student already knows (based 
on past performance), which skills it believes ought to be taught next, and its understanding of the skills 
required by the problems in its problem bank. 

BIP's architecture is consistent with the generic tutor. BIP's Curriculum Information Network 
represents all the skills to be taught and the relations among them. Skills are represented quite 
narrowly; for example, "initialize a counter variable" or "print a literal string." The relations specify 
whether skills are analogous to other skills, whether they are easier or harder or at the same difficulty 
level as other skills, and whether there are any prerequisite skills. As an example, printing a numeric 
literal (or constant) is considered conceptually analogous to, but also easier than, printing a string 
literal; both are considered easier than printing a numeric variable; and printing a numeric literal is 
considered a prerequisite to printing the sum of two numbers. 
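
The relations in the printing example can be encoded as three small sets. The Python sketch below is only a guess at a convenient representation and does not reproduce BIP's actual data structures; the skill identifiers and helper functions are hypothetical names, while the relations themselves are taken from the example above.

```python
# (easier, harder) pairs from the example in the text.
EASIER_THAN = {
    ("print_numeric_literal", "print_string_literal"),
    ("print_numeric_literal", "print_numeric_variable"),
    ("print_string_literal", "print_numeric_variable"),
}

# Unordered pairs of conceptually analogous skills.
ANALOGOUS = {frozenset({"print_numeric_literal", "print_string_literal"})}

# (prerequisite, dependent skill) pairs.
PREREQUISITE_OF = {("print_numeric_literal", "print_sum_of_two_numbers")}


def prerequisites(skill):
    """Skills that must be learned before `skill`."""
    return {pre for pre, post in PREREQUISITE_OF if post == skill}


def analogous(a, b):
    """True if the two skills are marked as conceptually analogous."""
    return frozenset({a, b}) in ANALOGOUS


def easier(a, b):
    """True if skill `a` is marked as easier than skill `b`."""
    return (a, b) in EASIER_THAN
```

Keeping the three relation types separate mirrors the text, in which two skills can be analogous and still differ in difficulty.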

A programming task is represented in terms of its component skill requirements. For example, a 
BIP task might ask the student to compute and print out the number of gifts sent on the 12th day of 
Christmas, given that: On the first day 1 gift was sent; on the second day 1 + 2 gifts were sent; on the 
third day, 1 + 2 + 3 were sent; and so on. The student is expected to write a program that computes 
the sum of 1 + 2 + ... + 12. Based on a task analysis conducted by BIP's authors, BIP knows that the 
component skills required for solving this particular problem are initialize numeric variable, use for-next 
loop with literal as final value, and so forth. Each task is assumed to tap a number of skills. 
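
For readers who want to see the task worked, here is a solution sketch. BIP taught BASIC; the version below is written in Python purely for illustration and is not BIP's model solution. The comments indicate where the two component skills named above would appear, and the function name is our own.

```python
def gifts_on_day(day):
    """Number of gifts sent on the given day: 1 + 2 + ... + day."""
    total = 0                       # initialize numeric variable
    for n in range(1, day + 1):     # counted loop ending at a fixed final value
        total += n
    return total


print(gifts_on_day(12))  # prints 78
```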

*Barr et al. developed BIP-I; Wescourt et al. developed its successor BIP-II. The two systems are 
fairly similar, but we assume the newer system where there are discrepancies. 

BIP's student model is a list of the student's status with respect to each of 93 skills in the 
curriculum. There are five discrete status levels: UNSEEN (student has not yet seen a problem that 
required the skill), TROUBLE (student has seen but has not solved a problem that required the skill), 
MARGINAL (student has learned to a marginal degree), EASY (student has not yet seen but problem 
requires an easy skill to learn), and LEARNED (student has learned to a sufficient degree). After 
each problem, skill status is updated as a result of the student's self-evaluation and through two 
DOMAIN-EXPERT-like components to BIP: a BASIC interpreter which catches syntax errors, and a 
solution evaluator which determines whether the program is producing correct outputs. Finally, BIP 
also provides a number of aids to the student. The student may request help, a model solution in 
flowchart form, or a series of partial hints. 

BIP selects problems by first identifying skills for which the student is ready (ones that do not have 
any unlearned prerequisites) but that need work, which means (in order of priority) (a) skills which the
student has found difficult (i.e., from tasks not completed), (b) skills analogous to LEARNED skills, or
(c) skills postrequisite to LEARNED skills. Skills so identified are called NEEDED skills. BIP then 
identifies a task with NEEDED skills but no unlearned prerequisites. 
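
The selection procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BIP's actual implementation: the `Skill` class, the task names, the tie-breaking rule (prefer the highest-priority NEEDED skill, then the task covering more NEEDED skills), and the treatment of LEARNED as the only "learned" status are all hypothetical. The text does not say how the first skills become NEEDED for a brand-new student, so that case is left out.

```python
from dataclasses import dataclass, field

LEARNED = "LEARNED"  # status label from the text

@dataclass
class Skill:
    name: str
    prerequisites: set = field(default_factory=set)  # names of prerequisite skills
    analogous: set = field(default_factory=set)      # names of analogous skills

def is_ready(skill, status):
    """A skill is ready when it has no unlearned prerequisites."""
    return all(status.get(p) == LEARNED for p in skill.prerequisites)

def needed_skills(skills, status):
    """Map each ready, not-yet-learned skill to a priority (1 is highest)."""
    learned = {name for name, s in status.items() if s == LEARNED}
    needed = {}
    for sk in skills.values():
        if status.get(sk.name) == LEARNED or not is_ready(sk, status):
            continue
        if status.get(sk.name) == "TROUBLE":
            needed[sk.name] = 1   # (a) skill the student found difficult
        elif sk.analogous & learned:
            needed[sk.name] = 2   # (b) analogous to a LEARNED skill
        elif sk.prerequisites & learned:
            needed[sk.name] = 3   # (c) postrequisite to a LEARNED skill
    return needed

def select_task(tasks, skills, status):
    """Pick a task that exercises NEEDED skills and has no unlearned prerequisites."""
    needed = needed_skills(skills, status)
    best, best_key = None, None
    for task_name, task_skills in tasks.items():
        hits = [needed[s] for s in task_skills if s in needed]
        if not hits or not all(is_ready(skills[s], status) for s in task_skills):
            continue
        key = (min(hits), -len(hits))  # assumed tie-breaking rule
        if best_key is None or key < best_key:
            best, best_key = task_name, key
    return best

# Hypothetical curriculum fragment based on the printing example in the text.
skills = {
    "print_numeric_literal": Skill("print_numeric_literal"),
    "print_string_literal": Skill("print_string_literal",
                                  analogous={"print_numeric_literal"}),
    "print_sum": Skill("print_sum", prerequisites={"print_numeric_literal"}),
    "for_next_loop": Skill("for_next_loop", prerequisites={"print_sum"}),
}
tasks = {
    "greeting": {"print_string_literal"},
    "add_two": {"print_sum"},
    "twelve_days": {"print_sum", "for_next_loop"},
}
status = {"print_numeric_literal": LEARNED}
chosen = select_task(tasks, skills, status)
```

In this fragment the "twelve_days" task is never offered while "print_sum" is unlearned, because one of its skills still has an unlearned prerequisite.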

If the student successfully solves the selected task, BIP updates the student model by crediting the
associated task skills. If the student fails the problem or gives up (i.e., requests a new task), BIP
determines which skills to blame, according to criteria such as the student's self-evaluation, whether the
student already LEARNED some of the skills or analogous ones, and whether any task skills or
analogous ones are in an unlearned state. 

There are a number of ways in which aptitude information guides problem selection. For the fast
learner, if two skills are linked by difficulty (one is harder than the other), the system assumes that the
easier one is not a NEEDED skill; BIP also will select tasks with multiple NEEDED skills. If the
student is consistently having trouble, BIP opts for a slow-moving approach and minimizes the number
of NEEDED skills introduced in a single task.

Learning Indicators



Snow, Wescourt, and Collins (1986) collected aptitude and other personal data from 29 subjects
who had used BIP, and performed a number of analyses on the relationships among those data and
BIP variables. Table 3 shows the list of learning indicators used by Snow et al. We have divided the
list into three categories: learning progress indices, learning activity variables, and time allocation
variables. 

The sample was too small to draw definitive conclusions about relationships, but there were some 
suggestive findings worthy of further pursuit. First, the best learning progress index seemed to be the
slope of the number of skills acquired over the number of skills possible (skills slope). Determination
of best is based on two considerations: Skills slope was most representative of other learning progress
indices in that it had higher average intercorrelations with those indices (centrality), and it had higher
average correlations with the learning activity variables (a validity of sorts). Particularly intriguing was
that skills slope, along with a global achievement posttest, was more highly related to the activity
variables than was the raw number of skills acquired. Snow et al. (1986) suggested this may have been
due to the skills slope's capturing more about the progress of learning over time.
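
As a rough illustration of how a skills slope might be computed, the sketch below fits an ordinary least-squares line to one student's cumulative skills acquired against cumulative skills possible, one point per problem. The function name, the sample numbers, and the choice of the residual standard error of estimate as the "standard error" are assumptions made for the example; Snow et al.'s exact computation is not given here.

```python
import math

def fit_line(x, y):
    """Least-squares slope, intercept, and standard error of estimate for y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2))  # residual standard error (assumed meaning)
    return slope, intercept, std_err

# Hypothetical record: after each problem, cumulative skills the problems made
# possible (x) and cumulative skills the student was credited with (y).
skills_possible = [2, 4, 6, 8]
skills_acquired = [1, 2, 3, 4]
slope, intercept, std_err = fit_line(skills_possible, skills_acquired)
```

A slope near 1 would mean the student acquires nearly every skill the problems make available; the raw count of skills acquired, by contrast, ignores how many opportunities the student had.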

The second major finding concerned the role of the activity variables in predicting learning 
outcome. As it turned out, most of the tool usage indicators, such as requests for demonstrations, 
hints, and model solutions, were associated with poor posttest performance. Poor performers also 
spent more time debugging and less time planning than did others, and were more likely to quit the
task or start over. In contrast, good performers requested fewer hints, spent more time implementing
rather than debugging, and were more likely to test different cases after a successful run of their
program (Indicator 15). This may have reflected good students' desire to perform additional tests of
their knowledge, perhaps to probe the boundaries of their understanding, even after passing the test.

Table 3. Learning Indicators from BIP, the Programming Tutor



LEARNING PROGRESS INDICES

1. Number of problems seen
2. Mean time per problem
3. Number of skills acquired
4. Skills acquired per problem (slope, intercept, standard error)
5. Skills acquired per time on task (slope, intercept, standard error)
6. Skills acquired per skills possible (slope, intercept, standard error)

LEARNING ACTIVITY VARIABLES
(Counts of activities, to be divided by number of problems seen)

1. Student produces correct solution
2. Student has difficulty on the task (according to BIP)
3. Student admits not understanding the task
4. Student disagrees with solution evaluator
5. Student requests solution model
6. Student requests solution flow chart
7. Student requests model program
8. Student starts problem over
9. Student requests at least 1 hint before starting
10. Student requests at least 1 but not all hints
11. Student requests all hints (0 - 5 on a problem)
12. Student quits the problem
13. Student quits the problem after seeing all the hints
14. Student quits the problem without seeing any hints
15. Student tests different input cases after successful solution
16. Student tests different input cases after failed solution
17. Student uses BIP input data after failed solution
18. Student runs program parts rather than complete program
19. Student requests aid (model, help, hint) after an error

TIME ALLOCATION

1. Planning: Proportion of time spent before coding
2. Implementing: Proportion of time spent writing code
3. Debugging: Proportion of time spent debugging code



Note: Time on the tutor must fall into one and only one of the three time allocation portions. 






Applying the Taxonomy

In evaluating the BIP tutor with respect to the taxonomy, we ask two questions: (a) What learning
skills does BIP exercise (i.e., how can BIP be classified)? and (b) How comprehensive are the
indicators used by Wescourt et al. (1977) and Snow et al. (1986) in measuring students' learning skills
and their learning progress? 

To address the first question, consider a distinction between what is tested and what is taught. BIP 
primarily tests for fairly specific skills, in that virtually all its tests are of the multiple operator selection
variety (i.e., students write programs). The posttest also undoubtedly taps some propositional,
schematic knowledge, but not extensively. Other knowledge outcomes could be tested, but they are
not. BIP teaches skills by having students first read a text (Learning from Instruction, in taxonomy
terminology), then apply the studied skills in a problem-solving context (Learning through Compilation
and Learning by Drill & Practice). Some students also request help and thereby engage in Learning
from Examples. The good students also tend to invoke Observational Learning when they perform
additional tests of their programs.

Figure 3a summarizes our assessment of (a) what skills are being exercised by BIP, indicated as the
solid bar, and (b) what skills are being tested, indicated as the striped bar. Bar size represents the
proportion of time spent either engaging the learning skill (solid) or having the skill tested (striped),
relative to engaging or testing other skills. It is important to keep in mind that this analysis is rather
informal. We made some rough computations of the times students engaged in the various activities,
based on a review of Snow et al.'s (1986) data on the learning indicators, and on Wescourt et al.'s
(1977) report of some other summary statistics. Our analysis is meant to be merely suggestive. A
more rigorous, systematic analysis of BIP could produce a precise breakdown of the time spent
exercising and testing various learning skills, separately for each student. Also note that only the
knowledge type and instructional environment dimensions are indicated in Figure 3. Domain is
indicated in Figure 1b (computer programming is highly quantitative/technical and quality of decisions
is emphasized). Learning style is not directly assessed in BIP.

Figure 3. Learning activities profiles for a) BIP, b) the LISP tutor, and c) Smithtown; solid bars
represent the proportion of time the particular skill (defined by the environment x knowledge type
cell) is exercised in the tutor, relative to other skills; striped bars represent the proportion of
time the skill is tested, relative to other skills.

An approach to the second question, concerning indicator comprehensiveness, is suggested by
Figure 3a: Which skills are being exercised and not tested? First, we can see that although students
are learning rules, they are not tested for them. This could be remedied by including operator tracing
or selection tests. Second, students also are probably acquiring some general rules and skills regarding
program writing strategies, but BIP does not directly test for these. Transfer-of-training tests inserted
into the program (as part of the curriculum) would help determine the generality of the skills learned
in BIP. Third, students read text, and get tested on their knowledge during the posttest, but it would be
possible to test the propositional and schematic knowledge resulting from reading the text more
directly by administering sentence verification tests, sorting tasks, and the like (see Table 2). Finally,
the task of writing programs is an operator selection task and thus is more difficult than a task that
would require students merely to understand the workings of a program (an operator tracing task).
Students may understand a program they are unable to write. The inclusion of a program
understanding task would tap knowledge that would be missed otherwise and thus should enhance the
accuracy of the student model.

In sum, BIP generates many indicators of student status and learning progress. Application of the
taxonomy suggests a number of additional ways in which a student's knowledge and learning skill could
be assessed. Expanding the breadth of learning skill probes should affect the overall quality of any
intelligent tutoring system, both in its role as a training device and as a research tool. The performance
of an ITS with a student-modeling component is highly dependent on the quality of the student model
insofar as the system's main job is to select appropriate-level problems. Thus, an ITS should improve
with a better student model, and we made suggestions here for refining a student model. As a research
tool, an ITS can serve as an environment in which to examine the interrelationships among learning
skills and learning activities. Snow et al.'s analysis of BIP relied on a rich set of learning indicators.
But we think that the taxonomy can be used to provide an additional psychological basis for expressing
those indicators.

(2) Anderson's LISP Tutor

General System Description 

Anderson and his research group have developed intelligent tutoring systems for geometry, algebra, 
and the programming language LISP. We focus here on the LISP tutor. Descriptions of the tutor are
available (Anderson et al., 1985); thus, we only summarize some of the main features of the system,
especially as they contrast with BIP.

The LISP tutor follows the generic architecture fairly closely. Students read some material in a
textbook, but then go on to spend most of their time interacting with the program. The program
selects problems, gives the student help or advice when asked, and interrupts if the student is
floundering. 

An innovation of the LISP tutor is its use of what Reiser, Anderson, and Farrell (1985) called the
model-tracing methodology, the process by which the tutor understands what the student is trying to do
while the student attempts to solve a problem. Whenever the student types in an expression (as part of
a solution attempt), the tutor evaluates the expression as to whether it is the same as what the ideal
student would type in, or whether it indicates a misconception (or bug). If a misconception is
indicated, the tutor intervenes with advice.

For a tutor to analyze the student's response so microscopically, it has to know essentially every 
correct step and every plausible wrong step in every problem. The LISP tutor does not incorporate
enough domain knowledge to be able to interpret every action a student might take, but it does have
enough knowledge to be able to interpret all correct solutions and approximately 45% to 80% of
students' errors (Reiser et al., 1985). (In cases where the tutor cannot interpret a student's behavior, it
typically probes the student with a multiple-choice question.) When the LISP tutor poses a problem, it
goes about trying to solve the problem itself simultaneously with the student. It solves the posed
problem with its own production system, which consists of approximately 400 production rules for
correctly writing programs (Anderson, 1987b). It also solves the problem in various plausible incorrect
ways, through the action of about 600 incorrect or buggy production rules. Determining what the
student is doing is a matter of comparing student input with its internal production system results.
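
The comparison step can be pictured with a deliberately simplified sketch. The real tutor matches input against the results of running roughly 1,000 correct and buggy production rules; here, as an assumption made purely for illustration, those results are replaced by two small lookup tables, and the rule names and LISP expressions are invented.

```python
# Hypothetical stand-ins for production-system output at one step of a problem:
# expression text -> name of the rule that would have produced it.
CORRECT_RULES = {"(car lst)": "get-first-element"}
BUGGY_RULES = {"(cdr lst)": "confuses-car-with-cdr"}

def trace_step(expression, correct_rules=CORRECT_RULES, buggy_rules=BUGGY_RULES):
    """Classify one student expression as correct, a known bug, or unrecognized."""
    key = " ".join(expression.split())  # normalize whitespace before matching
    if key in correct_rules:
        return ("correct", correct_rules[key])
    if key in buggy_rules:
        return ("bug", buggy_rules[key])  # the tutor would intervene with advice
    return ("unrecognized", None)         # the tutor would probe with a question

result = trace_step("(cdr  lst)")
```

The three outcomes correspond to the three cases in the text: a step the ideal student would take, a recognized misconception, and behavior the tutor cannot interpret.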

Learning Indicators 

The LISP tutor keeps a record of the student's status with respect to each skill being taught, where
skills are the 400 correct production rules. An indicator of how well the student knows a rule is
incremented when the student uses the rule correctly, and decremented when the student makes an
error. Remedial problems may be selected to give a student experience in using a troublesome rule.

Unfortunately, studies have not been done on the relationships among learning indicators and
outcomes. Most of the evaluation studies have simply compared LISP-tutored students with
classroom- or human-tutored students on a standard achievement test administered at the end of the
course. However, one study did investigate individual differences in acquisition and retention of
individual productions over a series of 10 lesson-sessions (Anderson, in press). In this analysis, each
production was scored for the number of times it was used incorrectly in problem solving, separately
for each session. A series of factor analyses was performed on these data to determine whether
production factors would emerge. For example, it could be that productions associated with one kind
of learning (e.g., learning to trace functions, planning) would form a factor separate from some other
kind of learning (e.g., learning to select functions, coding). Or lesson-specific factors could have
emerged. In fact, Anderson found evidence for two broad factors: An acquisition factor captured
individual differences in speed of production acquisition, and a retention factor captured individual
differences in the likelihood that acquired productions were retained in a later session.

Applying the Taxonomy

Consider first how we might classify the LISP tutor. Students spend most of their time learning
specific production rules and skills and are continually tested for their ability to apply them in writing
LISP functions. Every student action can be viewed as a test response because the system is
interpreting that response as an indication of whether the student knows a particular production rule.
Thus, learning and testing activities in the LISP tutor are almost completely integrated.

Although students are learning skills, insofar as writing functions is a multiple operator selection
task, the LISP tutor is testing for students' knowledge of the rules underlying those skills. But this
merely reflects the fact that skills in the LISP tutor are defined precisely in terms of their constituent
rules. Interestingly, the fact that the LISP tutor can represent a student's skill without directly
evaluating that skill (i.e., the system never evaluates whether the function works, per se) is evidence
against the taxonomy's supposition of skill as a separate knowledge type. However, this presumes a
rule-level understanding of skill. In domains for which such a detailed understanding is not yet
available (most domains imaginable at this time), skill probably ought to be considered a functionally
distinct category, even if only for pragmatic reasons.

The instructional environment is one in which students learn initially through brief instruction (a
pamphlet or a textbook), but then go on to compile and refine that knowledge by engaging in extended
problem solving. Figure 3b summarizes our assessment of what learning skills are being exercised and
tested in the LISP tutor.

Note that in addition to indicating that students are learning declarative knowledge by instruction,
and procedural knowledge by compiling and practicing it, we have indicated other learning products
and sources. The other products are the general rules and skills probably being taught by the LISP
tutor, even though that is not a goal for the tutor. The other sources have to do with the fact that the
LISP tutor is capable of delivering context-sensitive tutorial advice and, through its coaching
capabilities, can readily change the nature of the instructional environment. On one occasion it might
correct a student's attempt through direct instruction, but then it might later suggest an analogy to a
student, or provide examples of a concept.

Now consider the testing comprehensiveness issue. As can be seen in Figure 3b, we consider all of
the LISP tutor's testing to be for Rule knowledge, either in the Compilation or the Drill and Practice
environments. (We could also consider Automatic Skills to be tested, but that would require a rather
detailed analysis of the LISP tutor's entire production collection for how big, compiled productions
subsume their smaller precursors.) Note that first, as with BIP, students' success at propositional
learning and their ability to acquire general rules and skills are not tested. This situation could be
remedied with the insertion of sentence verification and transfer-of-training tests. But a more
intriguing suggestion from the standpoint of research derives from the fact that the LISP tutor's
multi-faceted coaching capability, which offers various kinds of tutorial remediation, greatly expands the
range of learning events that may be investigated. For example, it would be possible (and interesting)
to keep track of production strength modification separately for each of the various instructional
environments. That is, one could trace the growth in rule indicators over time as a function of whether
those rules were taught (or remediated) with instructional advice, analogies, examples, and so on. One
could ask the question of whether instruction using analogies results in greater subsequent ability to
use the rule(s) so instructed, for example.

In summary, because of the way in which it models students' knowledge as production rules, and
carefully controls the learning environment, the LISP tutor is ideally suited for measuring learning
skills such as the rate at which productions are composed, or the probability of compiling a sequence of
productions as a function of exposure to that sequence. Augmented with the additional tests and
performance records suggested by the application of the taxonomy, the LISP tutor could serve as an
excellent research tool for investigating the time course of learning and individual differences therein.

(3) Smithtown: Discovery World for Economic Principles 

General System Description 

Unlike the other two systems, Smithtown's main goal is to enhance students' general
problem-solving and inductive learning skills. It does this in the substantive context of microeconomics in
teaching the laws of supply and demand (Shute & Glaser, in press). Smithtown is highly interactive:
students pose questions and conduct experiments within the computer environment, testing and
enriching their knowledge of functional relationships by manipulating various economic factors.

As a discovery environment, Smithtown is quite different from BIP and the LISP tutor in that there
is no fixed curriculum. The student, not the system, generates problems and hypotheses. After
generating a hypothesis (such as "Does increasing the price of coffee affect the supply or demand of
tea?"), the student tests it by executing a series of actions, such as changing the values of two variables
and observing the bivariate plot. This series of actions, or behaviors, for creating, executing, and
following up a given experiment, defines a student solution.

Despite having no curriculum, Smithtown does have the instructional goal of teaching general 
problem-solving rules and skills (called good critics) such as "collect baseline data before altering a
variable" or "generalize a concept across two unrelated goods." Instead of a curriculum guiding
instructional decisions, Smithtown relies on a process of constantly monitoring student actions, looking
for evidence of good and poor behavior, and then coaching students to become more effective problem
solvers. The system keeps a detailed history list of all student actions, grouping them into (i.e.,
interpreting them as) behaviors and solutions. Smithtown diagnoses solution quality in two ways. It
looks for overt errors by comparing student solutions with its buggy critics, which are sets of actions (or
non-actions) that constitute nonoptimal behaviors (e.g., "fail to record relevant data in the online
notebook"). It also compares student solutions with its own good critics (expert solutions).
Discrepancies between the two are collected into a list of potential problem areas and passed on to the
Coach for possible remediation. To illustrate, if the student failed to enter data into the online
notebook for several time frames and had made some changes to variables, the system would recognize
this as a deficient pattern and prompt the student to start using the notebook more consistently.

Smithtown's student model is based on two statistics: (a) the number of times the student 
demonstrates a buggy critic (errors of commission), and (b) the ratio of the number of times the
student uses a good critic over the number of times it was applicable (errors of omission). Coaching is
based on the heuristic of first advising about buggy behaviors, then advising on any blatant errors of
omission. Advice is always given in the context of a particular experiment, so, like the LISP tutor, it is
context-sensitive. For example, the coach might say,

You haven't graphed any data yet and I think you should try it out. This is often a good
way of viewing data. It lets you plot variables together and some surprising relationships
may become apparent.

However, the coach is fairly unobtrusive: After advice is given, there is no further coaching for some 
time. 
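
The two statistics and the coaching heuristic can be sketched as follows. This is an illustration under assumptions, not Smithtown's code: the class and method names are hypothetical, "blatant" is operationalized as a use ratio at or below an arbitrary cutoff of 0.25, the most frequent buggy critic is advised first, and the coach's habit of staying quiet for a while after giving advice is not modeled.

```python
from dataclasses import dataclass, field

@dataclass
class StudentModel:
    buggy_counts: dict = field(default_factory=dict)     # errors of commission
    good_used: dict = field(default_factory=dict)        # times a good critic was used
    good_applicable: dict = field(default_factory=dict)  # times it was applicable

    def saw_buggy(self, critic):
        """Record one demonstration of a buggy critic."""
        self.buggy_counts[critic] = self.buggy_counts.get(critic, 0) + 1

    def saw_opportunity(self, critic, used):
        """Record one occasion on which a good critic applied, used or not."""
        self.good_applicable[critic] = self.good_applicable.get(critic, 0) + 1
        if used:
            self.good_used[critic] = self.good_used.get(critic, 0) + 1

    def good_ratio(self, critic):
        """Uses of a good critic divided by the number of times it was applicable."""
        applicable = self.good_applicable.get(critic, 0)
        if applicable == 0:
            return None
        return self.good_used.get(critic, 0) / applicable

    def next_advice(self, omission_cutoff=0.25):
        """Advise on buggy behaviors first, then on blatant errors of omission."""
        if self.buggy_counts:
            return ("buggy", max(self.buggy_counts, key=self.buggy_counts.get))
        ratios = {c: self.good_ratio(c) for c in self.good_applicable}
        blatant = {c: r for c, r in ratios.items() if r <= omission_cutoff}
        if blatant:
            return ("omission", min(blatant, key=blatant.get))
        return None

model = StudentModel()
for used in (False, False, False, True):
    model.saw_opportunity("collect baseline data before altering a variable", used)
first_advice = model.next_advice()
model.saw_buggy("fail to record relevant data in the online notebook")
second_advice = model.next_advice()
```

Once a buggy critic has been observed, it takes precedence over the omission, which mirrors the ordering stated in the text.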

Smithtown also knows about variable relationships that constitute economics principles (such as
"Price is inversely related to quantity demanded"). If a student uses the system's hypothesis menu and
states this relationship (e.g., "As price increases, quantity demanded decreases"), the student is
congratulated and told the name of the law just discovered (e.g., "Congratulations! You have just
discovered what economists refer to as the Law of Demand").

Learning Indicators 

Shute, Glaser, and Raghavan (in press) conducted an extensive evaluation of differences among
students in what the students learned and how they interacted with Smithtown. Two data sources were
used: a list of all student actions, and a set of verbal protocols in which students justified their actions
and predicted outcomes of the actions.

Table 4 shows a set of 29 learning indicators constructed for analyzing individuals' performance.
Indicators are clustered into three general behavior categories: (a) activity/exploratory level skills
(indicators relating to activity level and exploratory behaviors), (b) data management level skills
(indicators for data recording, efficient tool usage, and use of evidence), and (c) thinking and planning
level skills (indicators for consistent behaviors, effective generalization, and effective experimental
behaviors).

Table 4. Learning Indicators from Smithtown, the Economics Tutor

ACTIVITY/EXPLORATORY LEVEL SKILLS

I. ACTIVITY LEVEL

1. Total number of actions
2. Total number of experiments
3. Number of changes to the price of the goods

II. EXPLORATORY BEHAVIORS (Counts; i.e., number of ...)

4. Markets investigated
5. Independent variables changed
6. Computer-adjusted prices
7. Times market sales information was viewed
8. Baseline data observations of market in equilibrium

DATA-MANAGEMENT LEVEL SKILLS

III. DATA RECORDING

9. Total number of notebook entries
10. Number of baseline data entries of market in equilibrium
11. Entry of changed independent variables

IV. EFFICIENT TOOL USAGE (Ratios of number of effective uses over number of uses)

12. Number of relevant notebook entries / total number of notebook entries
13. Number of correct uses of table package / number of times table used
14. Number of correct uses of graph package / number of times graph used

V. USE OF EVIDENCE

15. Number of specific predictions / number of general hypotheses
16. Number of correct hypotheses / number of hypotheses

THINKING AND PLANNING LEVEL SKILLS

VI. CONSISTENT BEHAVIORS (Counts; i.e., number of ...)

17. Notebook entries of planning menu items
18. Notebook entries of planning menu items / planning opportunities
19. Number of times variables were changed that had been specified beforehand in the
planning menu

VII. EFFECTIVE GENERALIZATION (Event counts; i.e., number of times ...)

20. An experiment was replicated
21. A concept was generalized across unrelated goods
22. A concept was generalized across related goods
23. The student had sufficient data for a generalization

VIII. EFFECTIVE EXPERIMENTAL BEHAVIORS (Event counts; i.e., number of times ...)

24. A change to an independent variable was sufficiently large
25. One of the experimental frames was selected
26. The prediction menu was used to specify an event outcome
27. A variable was changed (per experiment)
28. An action was taken (per experiment)
29. An economic concept was learned (per session)

Shute et al.'s sample (N = 10) was too small to analyze formally, but the indicators were examined
for which ones discriminated successful from unsuccessful learners. Two subjects, one who performed
poorly on the pretest but well on the posttest (a successful learner) and one who did poorly on
both tests (an unsuccessful learner), were selected for more careful scrutiny.

The two subjects differed mostly on indicators of thinking and planning skills (i.e., effective 
experimental behaviors). In particular, the better subject collected and organized his data from a more 
theory-driven perspective, which contrasted with the more superficial and less theory-driven approach 
used by the poorer subject. The better subject generalized concepts across multiple markets (which the
poorer subject did not do), engaged in more investigations within a given market, and did not move
randomly among markets as did the poorer subject. The better subject also made large changes to
variables so that any repercussions could be detected. This contrasted with typically small changes
made by the poorer subject, who justified her choices by claiming they were more "realistic."
Replicating experiments to test the validity of results is an important scientific behavior and similar to
BIP's Indicator 15. The better subject conscientiously replicated experiments whereas the poorer
subject did not. One other indicator, data management skills, distinguished the two subjects. The
better subject recorded more notebook entries, and the ones that he recorded consistently included
relevant variables from the planning menu. The poorer subject used the notebook sporadically and
often failed to record important information.

Applying the Taxonomy

Again, we first consider the classification of Smithtown. Knowledge types taught are primarily
general skills (i.e., learning effective inquiry strategies for a new domain), domain-specific skills
pertaining to economics knowledge, and domain-specific mental models of the functional relationships
among microeconomic factors. Students also are presumed to acquire some declarative knowledge and
rules about economics while interacting with the environment. The instructional environment is a
discovery microworld and thus most of the learning that occurs results from students inducing
knowledge and skills through observation and discovery, then perhaps compiling those skills by
practicing them in the conduct of experiments. There is tutorial assistance if a student is judged to be
floundering in discovery mode, however; we indicate this in Figure 3c as learning propositions and skills
by direct instruction. Figure 3c shows that in overall emphasis, Smithtown is quite distinct in both goals
and approach from BIP and the LISP tutor. 

Regarding the issue of testing comprehensiveness in Smithtown, we consider two kinds of tests: (a)
the online indicators used by the system in diagnosis, and (b) the separate posttest that measures
economics knowledge gained during the tutorial. For the purpose of filling out Figure 3c, we
considered half the total testing to be online and the other half to be the posttest; the striped bars are
marked as to the testing source. Figure 3c shows that as in the LISP tutor, the online indicators
primarily reflect rule and skill knowledge, but in Smithtown, the testing context is the discovery
environment. Another key difference is that the rule and skill knowledge is not related to the
economics domain but rather, to the subject's ability to manipulate the environment and use its tools to
test hypotheses. The posttest did tap domain knowledge. One part of the posttest battery was a
multiple-choice test that measured declarative knowledge. A second part was a "scenarios test" that
had subjects reason through various economics scenarios. The scenarios test illustrates a means for
assessing mental models; it was designed to assess students' ability to run mental simulations of
complex economics scenarios (see Shute & Glaser, in press, for a detailed discussion of the test).

Figure 3c suggests that perhaps the greatest mismatch between what learning skills were exercised 
and what were tested occurs in the General Rule and Skill cells. A shortcoming of the Smithtown 
evaluation is that one of its stated primary goals is to help students become more effective in 
conducting experiments in a microworld environment, acquiring general skills as a result of their 
investigations. But this instructional goal was measured only indirectly on the posttest, which relied on
declarative tests of economics knowledge. A more direct assessment of the degree to which the stated
goals could be reached would require a transfer of skills in a system structured similarly to Smithtown
but containing different domain knowledge (interestingly, there is such a system, but the transfer
experiment has not yet been conducted). Truly general inquiry skills developed in Smithtown would
presumably transfer to the new environment. 

Another smaller mismatch is that declarative knowledge of basic economics principles was tested at
posttest, but not while students were interacting with the tutor. It seems reasonable, both from a
research standpoint and from the standpoint of enhancing the student model, to integrate declarative 
knowledge tests with tutoring. 

A major factor missing here and throughout our discussion of the three tutors is the style
dimension. Inspection of Table 4 shows that the indicators Smithtown collects and monitors are
really not direct indicators of learning skill per se but rather are style indicators, in the sense that they
reveal how an individual organizes his or her learning environment. From this perspective, a key
question addressed in the Shute et al. analysis had to do with style interrelationships (the
"dimensionality of style" question) and the relationship between style and learning outcome (the validity
question); in one sense, this is exactly the study needed to understand learning skills in the most
natural, ecologically valid context. It is also a preliminary question to one of the goals we are pushing
for here: to be able to assess basic learning skills, controlling for learning style. Smithtown may be
best suited for analysis of the style issue. But until style variables are better understood, more
structured environments such as BIP and the LISP tutor, which by forcibly directing learning activities
designate a less important role for individual variability in learning style, may be more conducive to
research on basic learning skills.

V. LEARNING INDICATORS FOR VALIDATION STUDIES 

To this point, we have discussed how the taxonomy might be applied so as to enable a more
thorough evaluation of student learning skills and outcomes. The applications discussed above might
have the flavor of suggestions for improving the tutors. That is not the intention. We see the main
function of the taxonomy as a research one. By more thoroughly examining what students
learn in instruction, it should be possible to conduct more refined studies on individual differences in
learning. Snow et al. (1986) generated and analyzed a set of learning indicators, Anderson (in press)
did a similar analysis, and a similar analysis is underway for Smithtown. Our claim is that the taxonomy
should suggest additional ways in which to record learning skills, and this should result in a
psychologically rich and principled set of additional learning indicators. Each cell in the full
four-dimensional taxonomy defines a proposed learning skill. An important next question, open to
empirical investigation, has to do with the true reduced-space dimensionality of learning skills (see
footnote 1). From an individual differences perspective, how many learning abilities must we posit, and
at what level of detail, to characterize skill differences among learners over all taxonomy cell tasks?

There is a second, related application. The taxonomy should help us develop for instructional
programs learning indicators that can serve as criteria against which other individual difference
measures, such as aptitude and basic abilities tests, might be validated. That is, our taxonomy-derived
indicators can serve as supplements or even replacements for the conventional criteria of post-course
achievement tests, course grade-point average, on-the-job performance tests, and supervisor/teacher
ratings in the conduct of construct validation studies. Indeed, it was this goal of creating more
extensive criteria against which new aptitude tests might be validated that led us into the taxonomy
project in the first place.

Learning Abilities Measurement Program (LAMP)



Over the past several years, the Air Force has supported a program of basic research designed to
explore the possibility of using contemporary cognitive theory as the basis for a new system of ability
measurement (Kyllonen, 1986; Kyllonen & Christal, in press). Currently, the Air Force, as well as the
other Services, selects and assigns applicants at least partly on the basis of their performance on a
conventional aptitude battery, which includes tests of reading comprehension, arithmetic reasoning,
numerical operations, and so forth. The goal of the Learning Abilities Measurement Program
(LAMP) is to provide the research base that might lead to supplementing or even replacing those
conventional tests with new measures more closely aligned with an information processing perspective.

What might these new tests be? The project has thus far investigated measures of working memory
capacity, information processing speed, breadth and depth of declarative knowledge, availability of
strategic knowledge, and other such abilities. It would go beyond the scope of this chapter to review
the project's research (see Kyllonen, 1986; Kyllonen & Christal, in press, for current reviews), but the
prototypical study investigates the relationship among various kinds of cognitive measures (such as
working memory capacity) and learning outcome measures (list recall) under various instructional
conditions (such as variations in study time).

A major focus of the research is examining the relationships between ability measures and learning
outcomes. But the range of learning outcomes investigated thus far, not only on our project but on
others' as well, has been quite limited, in two ways. First, the range of learning skills examined has
been rather narrow; this is especially apparent given the breadth of potential learning skills suggested
by the taxonomy. But second, and perhaps even more importantly, the learning tasks we have
employed have tended to be short-term laboratory tasks, and therefore may not be truly representative
of real-world learning activities. This inhibits the transition of research to application, insofar as
generalization from narrow laboratory tasks to real-world learning tasks is tenuous. And as Greeno
(1980) has argued, use of ecologically valid learning tasks is defensible from the standpoint of leading
to better basic research as well.

Thus, for both applied and theoretical reasons, a decision was made recently to expand the range of
learning criteria employed. A laboratory has recently been built at Lackland Air Force Base that
accommodates 30 work stations capable of administering intelligent computerized instruction such as
that reviewed previously. Intelligent tutoring systems in the domains of computer programming,
electronic troubleshooting, and flight engineering have been developed or are currently underway.
Over the next several years, we will investigate learning on these tutors and conduct studies that
examine the relationships among basic cognitive abilities and various learning skills and outcomes. We
expect the taxonomy as described here to assist us in developing learning indicators for the tutorial
environments.
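
The basic computation behind such a validation study is a correlation between an ability measure and each learning indicator serving as a criterion. The following is a minimal sketch, assuming one score per learner on each measure. The indicator names and all scores are hypothetical, and the Pearson coefficient is our choice for illustration; the text does not commit to a particular statistic.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

def validity_coefficients(ability_scores, indicators):
    """Correlate one ability measure with each learning indicator (the criteria)."""
    return {name: pearson_r(ability_scores, values)
            for name, values in indicators.items()}

# Hypothetical scores for five learners; illustrative only.
working_memory = [1, 2, 3, 4, 5]
indicators = {
    "rule_acquisition_rate": [2, 4, 6, 8, 10],
    "errors_to_criterion": [10, 8, 6, 4, 2],
}
coefficients = validity_coefficients(working_memory, indicators)
```

In this toy case the two coefficients are at the positive and negative extremes because the placeholder scores are perfectly linear; real indicators would of course yield intermediate values.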

Applying the Taxonomy: A Practical Guide

Thus, we are employing a two-pronged approach in generating learning skill indicators for LAMP
validation studies. We design instructional programs capable of producing rich traces of learner
activities, then we intend to analyze and categorize those activities so as to produce psychologically
meaningful learning indicators. Tables 5 and 6 present the general outline for our approach. Note that
we have written the design and analysis steps in such a way as to be broadly useful. Although our
application is in the design and (especially) analysis of intelligent tutoring systems, the steps suggested
could be adapted to any kind of instructional system, computerized or even classroom.
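
One way to turn a trace of learner activities into indicators is to score each student separately for every cell of the knowledge-outcome-by-instructional-environment (KO x IE) matrix. The following is a minimal sketch, assuming each trace record names the student, the knowledge outcome tested, the instructional environment of the preceding exchange, and whether the test item was passed. The record layout, the labels, and the use of proportion correct as the success score are all assumptions made for illustration.

```python
from collections import defaultdict

def ko_ie_success_matrix(trace):
    """Aggregate test results into per-student, per-cell proportion correct.

    `trace` is an iterable of (student, knowledge_outcome, environment, passed)
    tuples, one per test item that follows an instructional exchange.
    """
    counts = defaultdict(lambda: [0, 0])  # (student, KO, IE) -> [passed, attempted]
    for student, outcome, environment, passed in trace:
        cell = counts[(student, outcome, environment)]
        cell[0] += 1 if passed else 0
        cell[1] += 1
    matrix = defaultdict(dict)
    for (student, outcome, environment), (n_passed, n_tried) in counts.items():
        matrix[student][(outcome, environment)] = n_passed / n_tried
    return dict(matrix)

# Hypothetical trace records; student ids and cell labels are illustrative.
trace = [
    ("s1", "rule", "discovery", True),
    ("s1", "rule", "discovery", False),
    ("s1", "proposition", "rote", True),
    ("s2", "rule", "discovery", True),
]
scores = ko_ie_success_matrix(trace)
```

Cells a student never encountered are simply absent from that student's row, which keeps untested cells distinct from cells with a score of zero.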

VI. SUMMARY AND DISCUSSION 

We have presented a taxonomy of learning, based on previous research and on contemporary
cognitive theory. We have also proposed how the taxonomy can be applied to generate indicators of
what a student in an instructional situation is learning, and how well he or she is learning it. But just
how well does our proposed taxonomy-indicator system work?

Consider four major uses for the system (these and a fifth research application are listed in Table
7). First, the taxonomy can suggest what kinds of skills are being exercised and tested in an
instructional setting. In this capacity, the taxonomy serves in much the same way Bloom's or Gagné's

Table 5. Applications of the Taxonomy: Suggestions for Design

INSTRUCTIONAL SYSTEM DESIGN STEPS

1. Determine desired knowledge outcomes:

   a. State the instructional goals (e.g., acquisition of a mental model, a set of propositions, a
      set of skills).

   b. Specify the particular facts/skills/mental models to be taught.

   c. Determine tests to be used for assessing particular knowledge outcomes (Table 2).

2. Determine environment for achieving knowledge outcomes:

   a. Consider the kind of learning strategy desirable to invoke (Table 1).

   b. Consider alternative means for achieving knowledge outcome (could be used as a
      remediation strategy, or simply as a variation to avoid instructional monotony).

   c. Record student learning success with respect to the
      knowledge-outcome-by-instructional-environment matrix. This allows more precise
      statements of the effectiveness of the instruction.

3. Consider learning style issues:

   a. Consider whether to encourage particular types (styles) of interaction.

   b. If learning style is left free, make provisions to record the manner in which the student
      interacts with the instructional environment (for suggestions see Tables 3 and 4). This
      also allows more precise statements of the effectiveness of the instruction.

   c. If particular learning styles are encouraged through feedback and suggestions, consider
      varying the kinds of styles encouraged so as to allow experimental comparisons of the
      relative effectiveness of various styles.

Table 6. Applications of the Taxonomy: Suggestions for Analysis

LEARNING TASK ANALYSIS STEPS

1. Determine the knowledge outcome goals for the instruction:

   a. Determine the nature of the stated instructional goals (e.g., acquisition of a mental
      model, a set of propositions, a set of skills).

   b. Determine what kinds of tests are embedded within the instruction (consulting Table 2).

   c. Determine the match between the tests used and the knowledge outcomes intended (as
      in Figure 3).

2. Determine the nature of the instructional environment:

   a. For every instructional exchange (every student-instructor interaction episode), consider
      what learning strategy is invoked (consulting Table 1) during the exchange. Generate
      learning activities profiles for the entire instructional program (as in Figure 3).

   b. Organize records of student learning success with respect to the
      knowledge-outcome-by-instructional-environment (KO x IE) matrix. That is, devise a means
      for assigning each student a separate learning success score for each cell in the KO x IE
      matrix. Scores would be based on tests following particular instructional exchanges.

3. Consider learning style issues:

   a. Consider whether particular types (styles) of interaction are encouraged.

   b. If learning style is left free, and there is between-student style variability, but no
      within-student style variability, then separate students by style before conducting any
      analyses of the KO x IE matrix.

   c. If learning style is left free, and there is within-student style variability (e.g., students
      engage in holistic processing some times, serial processing at others), create
      separate KO x IE profiles for the various style orientations.

4. Considerations for transfer studies:

   a. Degree of transfer should be a function of the similarity of the learning activities profiles
      for two learning tasks.

   b. Similarity is computed over the KO x IE matrices (possibly for separate styles), and
      domain.

5. Considerations for optimizing or predicting global outcomes:

   a. Expected global outcome for a particular student will depend on the match between the
      student's personal learning skill profile and the learning skills the instruction exercises
      (the learning activities profile, Figure 3).

   b. Optimizing global outcomes for a particular student can be seen as a linear
      programming problem. Instruction should maximize exercising the student's strongest
      skills subject to the cost (e.g., in time) for exercising those skills.
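
Step 5b of Table 6 can be illustrated with a small allocation sketch. It assumes the simplest possible formulation: each skill has a strength score for the student, a cost per unit of instruction, and a cap on how many units can usefully be given, and there is a single overall time budget. These quantities, their names, and the numbers are hypothetical; the table says only that the problem can be seen as one of linear programming. Under this one-constraint formulation, filling skills in order of strength per unit cost gives the optimum, so no solver is needed.

```python
def allocate_instruction(skills, budget):
    """Greedy solution to a one-constraint linear program.

    maximize   sum(strength_i * x_i)
    subject to sum(cost_i * x_i) <= budget,  0 <= x_i <= max_units_i

    `skills` maps a skill name to (strength, cost_per_unit, max_units).
    Costs must be positive. With a single budget constraint, allocating in
    order of strength per unit cost is optimal (the fractional knapsack case).
    """
    plan = {name: 0.0 for name in skills}
    remaining = budget
    by_ratio = sorted(skills, key=lambda n: skills[n][0] / skills[n][1], reverse=True)
    for name in by_ratio:
        strength, cost, max_units = skills[name]
        if remaining <= 0 or strength <= 0:
            break  # budget spent, or no remaining skill adds to the objective
        units = min(max_units, remaining / cost)
        plan[name] = units
        remaining -= units * cost
    return plan

# Hypothetical student profile: (strength, cost per unit in hours, max units).
skills = {
    "procedural": (0.75, 1.0, 4.0),
    "declarative": (0.5, 1.0, 4.0),
    "mental model": (0.25, 2.0, 4.0),
}
plan = allocate_instruction(skills, budget=6.0)
```

With more than one resource constraint (for example, separate limits on tutor time and laboratory time) the greedy rule no longer guarantees an optimum, and a general linear programming solver would be the appropriate tool.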



Table 7. Applications of the Taxonomy: What It Can Be Used For

INSTRUCTIONAL SYSTEM EVALUATORS
(Teachers and Administrators)

- Facilitates analysis of what kinds of learning skills are being exercised and tested in an
  instructional setting (see Figure 3)

INSTRUCTIONAL SYSTEM DESIGNERS

- Suggests a range of possible instructional environments for achieving particular knowledge
  outcomes (see Table 1/Figure 1)

- Specifies techniques (tests) for probing a wide range of knowledge and learning skill
  outcomes (see Table 2)

COGNITIVE RESEARCHERS

- Suggests predictions about transfer relations among learning experiences (see Figure
  1/Table 6)

- Suggests indicators (dependent variables) of what and how well a student is learning (see
  Figure 3/Tables 2, 6)






taxonomies do. The advantage to our proposal is that it is more closely tied to current cognitive theory,
which we hope will enable us to apply the system more easily in analyzing learning in routine
instructional settings. A second use for the system concerns primarily the environment dimension.
The specification of multiple instructional environments permits the assessment of a range of means
for achieving particular knowledge outcomes. If an instructor's goal is to teach a mental model of some
system, the instructor can simply instruct it, or use an analogy, or have the student discover the model
through observation of the system, and so on. A third use for the system is to make predictions about
transfer relations among learning experiences. We would predict that the closer, taxonomically, two
learning situations are, the more likely that whatever is learned in one will transfer to the other. Of
course, this is an open empirical question. A benefit of the taxonomy is that it suggests a
straightforward research program for addressing this kind of question.
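
To make the transfer prediction testable, taxonomic closeness has to be given a number. The following is a minimal sketch, assuming each learning situation is summarized as a learning activities profile that records how much instruction falls in each taxonomy cell. Cosine similarity is our own choice of measure for illustration, the cell labels and weights are hypothetical, and the sketch ignores the domain and style dimensions that a full comparison would also include.

```python
from math import sqrt

def profile_similarity(profile_a, profile_b):
    """Cosine similarity between two learning activities profiles.

    Each profile maps a taxonomy cell, e.g. (knowledge_outcome, environment),
    to the amount of instruction spent in that cell.
    """
    cells = set(profile_a) | set(profile_b)
    dot = sum(profile_a.get(c, 0.0) * profile_b.get(c, 0.0) for c in cells)
    norm_a = sqrt(sum(v * v for v in profile_a.values()))
    norm_b = sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a profile with no activity has no defined similarity")
    return dot / (norm_a * norm_b)

# Hypothetical profiles for three learning situations.
task_a = {("rule", "practice"): 3.0, ("skill", "practice"): 4.0}
task_b = {("rule", "practice"): 3.0, ("skill", "practice"): 4.0}
task_c = {("proposition", "rote"): 1.0}

same = profile_similarity(task_a, task_b)      # identical profiles
disjoint = profile_similarity(task_a, task_c)  # no shared cells
```

Under the prediction stated above, pairs of tasks with higher similarity scores should show more transfer; whether they do is the empirical question.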

While all three of these applications may be useful, we believe that the most important role of the
taxonomy is in establishing the means for probing a much wider range of knowledge and learning skill
outcomes. This capability is obviously important for research purposes, but it is also important for
evaluating educational systems. Consider a general problem in evaluating innovative educational
programs (discussed by Nickerson, Perkins, & Smith, 1985). Over the years, many such programs,
such as ones for teaching creative thinking or ones for teaching general thinking skills, have been
developed. All too often, casual observation suggests that such programs are having desirable effects
on students, but such effects do not show up under the scrutiny of carefully conducted evaluation
studies. Creators of such programs typically complain that the scientific model of evaluation is
inappropriate because the true gains students experience are somehow missed. One role for the
taxonomy might be to suggest how additional learning outcomes and skills can be assessed in order to
enable a more thorough evaluation.

Even among the three instructional programs we reviewed here, a rather conservative approach to
assessing the impact of the tutoring system was taken. To some extent, the LISP tutor, BIP, and
Smithtown all depend on standard achievement outcome tests as a means for their validation. Though
it is important to establish that these tutors do affect overall achievement, it is not sufficient. While
interacting with a tutor, or in any instructional environment, students can be learning many different
things. A major role for the taxonomy is to suggest a richer testing system for evaluating a broader
range of student outcomes.

Finally, the taxonomy-indicator system should facilitate pursuit of both applied and basic research
questions. Our major practical application for the taxonomy is to have it assist in the specification of
variables that indicate what and how well a subject is learning as the subject interacts with a tutor over
a lengthy series of lessons. These variables then will serve as criteria against which newly developed
measures of cognitive ability will be validated. Additionally, a wide range of basic research issues
emerges. Are the different knowledge types affected by the same variables? Are fast propositional
learners also fast production rule learners? Are there interactions between knowledge type and the
instructional environment? Are individual differences in learning more dependent on the knowledge
type or the environment? Our research programs are only at the very beginning stages in addressing
these kinds of fundamental questions about the nature of learning and individual differences therein.